A pinboard by
Anne O'Donnell

Postdoc, Boston Children's Hospital


Large reference genomic databases for diagnosis, gene discovery and incomplete penetrance evaluation

Identifying disease-causing, pathogenic variants among the sea of benign genetic variation remains a critical challenge in human genomic medicine. Large-scale reference databases such as gnomAD (the Genome Aggregation Database) include genome and exome data from >135,000 individuals from diverse ancestry groups, providing an unprecedented view of the natural spectrum of rare human genetic variation in both coding and noncoding regions. Application of this dataset has empowered the identification of truly rare variants and of constrained regions: regions of the genome that are highly intolerant of genetic variation, where a rare variant is more likely to be disease-causing. My research focuses on developing analytical and statistical methods that leverage population frequency, inferred inheritance mode, and local constraint (depletion of functional variation in the general population) to improve the inference of pathogenicity within a rigorous, quantitative framework. My work also highlights common pitfalls of working with large reference databases, including the presence of somatic variation and incomplete penetrance. In particular, we describe how new mutations that arise in the blood of adults (the typical source of DNA for sequencing) can be mistaken for inherited genetic variants, particularly when they give blood cells a growth advantage, so that a larger proportion of blood cells carry the mutation.
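The somatic-variation pitfall described above can be illustrated with a small sketch. It is not the method used in this research. It assumes, for illustration only, that an inherited heterozygous variant should appear in roughly half of the sequencing reads at its position, and that a variant acquired in a subset of blood cells will often appear in noticeably fewer. The function names, the 0.5 expectation and the `alpha` threshold are all hypothetical choices.

```python
from math import comb


def binom_lower_tail(k, n, p=0.5):
    """Probability of observing k or fewer successes in n trials, each with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def flag_possible_somatic(alt_reads, total_reads, alpha=1e-3):
    """Flag a variant call whose alternate-allele read count is improbably low
    for an inherited heterozygous variant.

    Assumption (illustrative only): a germline heterozygous variant yields
    alternate reads with probability 0.5. `alpha` is an arbitrary cutoff.
    """
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("read counts must satisfy 0 <= alt_reads <= total_reads, total_reads > 0")
    return binom_lower_tail(alt_reads, total_reads) < alpha


if __name__ == "__main__":
    # Hypothetical read counts, not taken from any real dataset.
    print(flag_possible_somatic(10, 100))  # strongly skewed allele fraction
    print(flag_possible_somatic(48, 100))  # close to the expected one half
```

A skewed read fraction is only a hint. Sequencing artifacts or mapping bias can produce the same pattern, so a flag like this would be a starting point for review rather than a classification.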

I am now using gnomAD to identify non-penetrant individuals: those who carry a genetic variant that typically causes a severe, pediatric-onset Mendelian disease but who do not have features of the disease. These individuals are rare, so identifying them requires genomic data from thousands of people. My goal is to identify molecular mechanisms of incomplete penetrance, which we hypothesize will include common nearby genetic variants that rescue the effect of the pathogenic variant by down-regulating the pathogenic copy of the gene or up-regulating expression of the normal copy. Studying the genomes of thousands of individuals in the general population advances our understanding of rare disease, improving our ability to diagnose it, discover new disease genes, and explore the mechanisms of incomplete penetrance.


Human knockouts and phenotypic analysis in a cohort with a high rate of consanguinity.

Abstract: A major goal of biomedicine is to understand the function of every gene in the human genome. Loss-of-function mutations can disrupt both copies of a given gene in humans and phenotypic analysis of such 'human knockouts' can provide insight into gene function. Consanguineous unions are more likely to result in offspring carrying homozygous loss-of-function mutations. In Pakistan, consanguinity rates are notably high. Here we sequence the protein-coding regions of 10,503 adult participants in the Pakistan Risk of Myocardial Infarction Study (PROMIS), designed to understand the determinants of cardiometabolic diseases in individuals from South Asia. We identified individuals carrying homozygous predicted loss-of-function (pLoF) mutations, and performed phenotypic analysis involving more than 200 biochemical and disease traits. We enumerated 49,138 rare (<1% minor allele frequency) pLoF mutations. These pLoF mutations are estimated to knock out 1,317 genes, each in at least one participant. Homozygosity for pLoF mutations at PLA2G7 was associated with absent enzymatic activity of soluble lipoprotein-associated phospholipase A2; at CYP2F1, with higher plasma interleukin-8 concentrations; at TREH, with lower concentrations of apoB-containing lipoprotein subfractions; at either A3GALT2 or NRG4, with markedly reduced plasma insulin C-peptide concentrations; and at SLC9A3R1, with mediators of calcium and phosphate signalling. Heterozygous deficiency of APOC3 has been shown to protect against coronary heart disease; we identified APOC3 homozygous pLoF carriers in our cohort. We recruited these human knockouts and challenged them with an oral fat load. Compared with family members lacking the mutation, individuals with APOC3 knocked out displayed marked blunting of the usual post-prandial rise in plasma triglycerides. 
Overall, these observations provide a roadmap for a 'human knockout project', a systematic effort to understand the phenotypic consequences of complete disruption of genes in humans.

Pub.: 14 Apr '17, Pinned: 29 Jun '17

Pathogenic ASXL1 somatic variants in reference databases complicate germline variant interpretation for Bohring-Opitz Syndrome.

Abstract: The clinical interpretation of genetic variants has come to rely heavily on reference population databases such as the Exome Aggregation Consortium (ExAC) database. Pathogenic variants in genes associated with severe, pediatric-onset, highly penetrant, autosomal dominant conditions are assumed to be absent or rare in these databases. Exome sequencing of a six-year-old female patient with seizures, developmental delay, dysmorphic features and failure to thrive identified an ASXL1 variant previously reported as causative of Bohring-Opitz syndrome (BOS). Surprisingly, the variant was observed seven times in the ExAC database, presumably in individuals without BOS. Although the BOS phenotype fit, the presence of the variant in reference population databases introduced ambiguity in result interpretation. Review of the literature revealed that acquired somatic mosaicism of ASXL1 variants (including pathogenic variants) during hematopoietic clonal expansion can occur with aging in healthy individuals. We examined all ASXL1 truncating variants in the ExAC database and determined most are likely somatic. Failure to consider somatic mosaicism may lead to the inaccurate assumption that conditions like Bohring-Opitz syndrome have reduced penetrance, or the misclassification of potentially pathogenic variants.

Pub.: 24 Feb '17, Pinned: 29 Jun '17

Improving genetic diagnosis in Mendelian disease with transcriptome sequencing.

Abstract: Exome and whole-genome sequencing are becoming increasingly routine approaches in Mendelian disease diagnosis. Despite their success, the current diagnostic rate for genomic analyses across a variety of rare diseases is approximately 25 to 50%. We explore the utility of transcriptome sequencing [RNA sequencing (RNA-seq)] as a complementary diagnostic tool in a cohort of 50 patients with genetically undiagnosed rare muscle disorders. We describe an integrated approach to analyze patient muscle RNA-seq, leveraging an analysis framework focused on the detection of transcript-level changes that are unique to the patient compared to more than 180 control skeletal muscle samples. We demonstrate the power of RNA-seq to validate candidate splice-disrupting mutations and to identify splice-altering variants in both exonic and deep intronic regions, yielding an overall diagnosis rate of 35%. We also report the discovery of a highly recurrent de novo intronic mutation in COL6A1 that results in a dominantly acting splice-gain event, disrupting the critical glycine repeat motif of the triple helical domain. We identify this pathogenic variant in a total of 27 genetically unsolved patients in an external collagen VI-like dystrophy cohort, thus explaining approximately 25% of patients clinically suggestive of having collagen VI dystrophy in whom prior genetic analysis is negative. Overall, this study represents a large systematic application of transcriptome sequencing to rare disease diagnosis and highlights its utility for the detection and interpretation of variants missed by current standard diagnostic approaches.

Pub.: 21 Apr '17, Pinned: 29 Jun '17