I'm a Ph.D. candidate at Harvard University in Dr. John Rinn's lab.
I use new experimental techniques to study how gene regulation differs between species.
You might have heard that humans share 99% of their DNA with chimpanzees, our closest living relatives. In fact, this statistic only applies to our "coding" DNA -- the DNA that eventually becomes proteins -- and our coding DNA only accounts for 1-3% of the total 3 billion base pairs of human DNA. So, what is the rest of our "non-coding" DNA doing? Scientists believe that most of this non-coding DNA is responsible for gene regulation; that is, it tells genes when and where to turn on.
Interestingly, although the human non-coding genome has diverged considerably from the chimpanzee non-coding genome, patterns of gene expression between the two species are relatively conserved. That means that even though the primary DNA sequence responsible for turning a liver gene on has changed over evolution, the liver gene is still a liver gene in both humans and chimpanzees. How these DNA sequences can diverge so much yet retain their function is unknown.
In my research, I am studying this phenomenon (known as "transcriptional evolution") using humans and mice as models. Further, instead of studying just protein-coding genes, I am also studying non-coding genes (genes that make RNA that never becomes a protein). These "non-coding RNAs" are a novel and poorly understood type of gene, and by including them in my analyses I hope to shed light on their function.
I am taking DNA sequences responsible for turning on a set of genes from both humans and mice and attaching them to a read-out called a reporter. I can do this with hundreds of thousands of sequences at once -- making the assay "massively parallel". I am mutating each DNA sequence and examining the effects that various mutations have on the reporter output. By doing this, I will be able to determine the DNA motifs that are important in the transcription of human genes and mouse genes -- and determine what patterns exist, if any, across the two species.
This research will help us understand the processes underlying evolution and bring us closer to answering that ever-present question: "what makes us human?"
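To make "examining the effects that various mutations have on the reporter output" a little more concrete, here is a minimal sketch of one common way such a comparison could be scored. It assumes that reporter output is summarized as RNA read counts normalized to DNA (input) read counts and that a mutation's effect is the log2 fold change relative to the unmutated sequence. The function names, the pseudocount, the mutation labels, and all counts are hypothetical and are not taken from my actual analysis.

```python
import math


def log2_activity(rna_count, dna_count, pseudocount=1.0):
    """Reporter activity as log2(RNA / DNA); the pseudocount avoids log(0)."""
    return math.log2((rna_count + pseudocount) / (dna_count + pseudocount))


def mutation_effects(wild_type, mutants):
    """Log2 fold change of each mutant relative to the wild-type sequence.

    wild_type: (rna_count, dna_count) for the unmutated sequence
    mutants:   dict mapping a mutation label to (rna_count, dna_count)
    """
    wt_activity = log2_activity(*wild_type)
    return {
        label: log2_activity(*counts) - wt_activity
        for label, counts in mutants.items()
    }


# Hypothetical counts, invented purely for illustration.
effects = mutation_effects(
    wild_type=(799, 99),
    mutants={"A12G": (199, 99), "C30T": (799, 99)},
)
for label, effect in effects.items():
    print(label, round(effect, 2))
```

In this toy example a negative value means the mutation lowers reporter output and a value near zero means it has little effect; real data would also need replicates and a statistical test before calling any position important.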
Abstract: The mammalian radiation has corresponded with rapid changes in noncoding regions of the genome, but we lack a comprehensive understanding of regulatory evolution in mammals. Here, we track the evolution of promoters and enhancers active in liver across 20 mammalian species from six diverse orders by profiling genomic enrichment of H3K27 acetylation and H3K4 trimethylation. We report that rapid evolution of enhancers is a universal feature of mammalian genomes. Most of the recently evolved enhancers arise from ancestral DNA exaptation, rather than lineage-specific expansions of repeat elements. In contrast, almost all liver promoters are partially or fully conserved across these species. Our data further reveal that recently evolved enhancers can be associated with genes under positive selection, demonstrating the power of this approach for annotating regulatory adaptations in genomic sequences. These results provide important insight into the functional genetics underpinning mammalian regulatory evolution.
Pub.: 31 Jan '15, Pinned: 29 Jun '17
Abstract: The functional consequences of genetic variation in mammalian regulatory elements are poorly understood. We report the in vivo dissection of three mammalian enhancers at single-nucleotide resolution through a massively parallel reporter assay. For each enhancer, we synthesized a library of >100,000 mutant haplotypes with 2-3% divergence from the wild-type sequence. Each haplotype was linked to a unique sequence tag embedded within a transcriptional cassette. We introduced each enhancer library into mouse liver and measured the relative activities of individual haplotypes en masse by sequencing the transcribed tags. Linear regression analysis yielded highly reproducible estimates of the effect of every possible single-nucleotide change on enhancer activity. The functional consequence of most mutations was modest, with ∼22% affecting activity by >1.2-fold and ∼3% by >2-fold. Several, but not all, positions with higher effects showed evidence for purifying selection, or co-localized with known liver-associated transcription factor binding sites, demonstrating the value of empirical high-resolution functional analysis.
Pub.: 01 Mar '12, Pinned: 29 Jun '17
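The abstract above mentions using linear regression to estimate the effect of each single-nucleotide change from haplotypes that each carry several mutations. The sketch below illustrates that general idea only; it is not the paper's pipeline. It assumes an additive model (activity = intercept + sum of the effects of the mutations a haplotype carries) fitted by ordinary least squares, and all function names and data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mutation_effects(haplotypes, activities, n_mutations):
    """Ordinary least squares via the normal equations.

    haplotypes: list of sets; each set holds the mutation indices a haplotype carries
    activities: measured (e.g. log-scale) activity of each haplotype
    Returns [intercept, effect_0, ..., effect_{n_mutations - 1}].
    """
    rows = [
        [1.0] + [1.0 if j in h else 0.0 for j in range(n_mutations)]
        for h in haplotypes
    ]
    p = n_mutations + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, activities)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data generated from intercept 3.0 and effects -1.0, 0.5, 0.0.
haplotypes = [set(), {0}, {1}, {0, 1}, {2}, {1, 2}]
activities = [3.0, 2.0, 3.5, 2.5, 3.0, 3.5]
coefs = fit_mutation_effects(haplotypes, activities, n_mutations=3)
print([round(c, 3) for c in coefs])
```

Because each haplotype carries a different combination of mutations, the regression can separate the contribution of each individual change even though no haplotype needs to carry that change alone. A real analysis would involve far more haplotypes than parameters, noisy measurements, and would likely use a numerical library rather than hand-written elimination.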
Abstract: While long intergenic noncoding RNAs (lincRNAs) and mRNAs share similar biogenesis pathways, these transcript classes differ in many regards. LincRNAs are less evolutionarily conserved, less abundant, and more tissue-specific, suggesting that their pre- and post-transcriptional regulation is different from that of mRNAs. Here, we perform an in-depth characterization of the features that contribute to lincRNA regulation in multiple human cell lines. We find that lincRNA promoters are depleted of transcription factor (TF) binding sites, yet enriched for some specific factors such as GATA and FOS relative to mRNA promoters. Surprisingly, we find that H3K9me3 -- a histone modification typically associated with transcriptional repression -- is more enriched at the promoters of active lincRNA loci than at those of active mRNAs. Moreover, H3K9me3-marked lincRNA genes are more tissue-specific. The most discriminant differences between lincRNAs and mRNAs involve splicing. LincRNAs are less efficiently spliced, which cannot be explained by differences in U1 binding or the density of exonic splicing enhancers but may be partially attributed to lower U2AF65 binding and weaker splicing-related motifs. Conversely, the stability of lincRNAs and mRNAs is similar, differing only with regard to the location of stabilizing protein binding sites. Finally, we find that certain transcriptional properties are correlated with higher evolutionary conservation in both DNA and RNA motifs and are enriched in lincRNAs that have been functionally characterized.
Pub.: 09 Dec '16, Pinned: 29 Jun '17