A suite of automated sequence analyses reduces the number of candidate deleterious variants and reveals a difference between probands and unaffected siblings.

Research paper by Fangning Gu, Anchi Wu, M. Grace Gordon, Lukas Vlahos, Shane Macnamara, Elizabeth Burke, May C. Malicdan, David R. Adams, Cynthia J. Tifft, Camilo Toro, William A. Gahl, Thomas C. Markello

Indexed on: 01 Feb '19
Published on: 01 Feb '19
Published in: Genetics in Medicine


The aim was to develop an automated exome analysis workflow that produces a very small number of candidate variants yet still detects different numbers of deleterious variants between probands and unaffected siblings.

Ninety-seven outbred nuclear families from the Undiagnosed Diseases Program/Network were studied, each including a single proband and the corresponding unaffected sibling(s). Single-nucleotide polymorphism (SNP) chip and exome analyses were performed on all, with the proband and the unaffected sibling each considered independently as the target. The total burden of candidate genetic variants was summed for probands and for siblings over all considered disease models. The exome analysis workflow included automated programs for ethnicity-matched genotype calling, a salvage pathway for Mendelian inconsistency, compound heterozygous recessive detection, BAM file regional curation, population frequency filtering, pedigree-aware BAM file noise evaluation, and exon deletion filtration. This workflow relied heavily on BAM file analysis.

A greater average number of pathogenic variants was found in probands than in unaffected siblings. This difference was significant (p < 0.05) when using published recommended thresholds, and it implies that causal variants are retained in many probands' lists.

Using Mendelian and non-Mendelian models, this agnostic exome analysis shows a difference between a small group of probands and their unaffected siblings. The workflow produces candidate lists small enough to pursue with laboratory validation.
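Two of the workflow steps named above, population frequency filtering and compound heterozygous recessive detection, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Variant` record, its field names, the function names, and the 0.01 frequency cutoff are all hypothetical, chosen only to show the general idea for a child with both parents genotyped. The same function can be run with either the proband or an unaffected sibling as the target child, which mirrors how the abstract describes treating each independently.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical per-variant record; genotypes are alt-allele counts (0, 1, 2)."""
    gene: str
    variant_id: str
    pop_freq: float  # population allele frequency (assumed to be annotated upstream)
    child: int
    mother: int
    father: int


def passes_frequency(v, max_freq=0.01):
    """Keep variants at or below an assumed population frequency cutoff."""
    return v.pop_freq <= max_freq


def compound_het_pairs(variants, max_freq=0.01):
    """Return {gene: [(maternal_id, paternal_id), ...]} for the target child.

    A pair qualifies when the child is heterozygous for both variants,
    one is inherited only from the mother and the other only from the father.
    """
    maternal = defaultdict(list)
    paternal = defaultdict(list)
    for v in variants:
        if not passes_frequency(v, max_freq) or v.child != 1:
            continue
        if v.mother == 1 and v.father == 0:
            maternal[v.gene].append(v.variant_id)
        elif v.father == 1 and v.mother == 0:
            paternal[v.gene].append(v.variant_id)

    pairs = {}
    for gene, maternal_ids in maternal.items():
        if gene in paternal:
            pairs[gene] = [(m, p) for m in maternal_ids for p in paternal[gene]]
    return pairs


def candidate_burden(variants, max_freq=0.01):
    """Count candidate compound heterozygous pairs for one target child."""
    return sum(len(p) for p in compound_het_pairs(variants, max_freq).values())
```

Calling `candidate_burden` once per proband and once per unaffected sibling would give per-individual counts under this single disease model; the study itself sums candidates over several models and applies further BAM-based filters that this sketch does not attempt to reproduce.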