A pinboard by
Rebecca Elyanow

PhD Student, Brown University


Identifying genomic rearrangements in cancer genomes

Cancer is a disease caused by mutations in the genome that promote cell proliferation and growth. Often, these mutations are not single nucleotide changes but large-scale genomic rearrangements, called structural variants. Such rearrangements include deletions, duplications, or inversions of entire genes. These structural variants are difficult to identify using current sequencing technologies because DNA must be fragmented into small pieces before being sequenced. The resulting fragments of DNA are pieced together, like a genomic puzzle, by aligning them to the human reference genome. I developed a probabilistic model to detect regions of a cancer genome affected by structural variation. The model uses a new sequencing technology, linked-read sequencing, which provides long-range information by labelling DNA fragments that originate from the same long DNA molecule.
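To make the long-range signal concrete, here is a minimal sketch of how shared barcodes could hint at a rearrangement. It is not the probabilistic model described above. It is a toy illustration that assumes each read has already been reduced to a (chromosome, position, barcode) tuple. The function names, the window size, and the example reads are all hypothetical.

```python
from collections import defaultdict


def barcodes_by_window(reads, window_size):
    """Map (chrom, window_index) -> set of barcodes seen in that window.

    `reads` is an iterable of (chrom, position, barcode) tuples
    (a hypothetical, simplified stand-in for aligned linked reads).
    """
    windows = defaultdict(set)
    for chrom, pos, barcode in reads:
        windows[(chrom, pos // window_size)].add(barcode)
    return windows


def shared_barcode_fraction(windows, a, b):
    """Jaccard overlap between the barcode sets of two windows."""
    set_a = windows.get(a, set())
    set_b = windows.get(b, set())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Toy data: barcodes BC1 and BC2 appear both near position 0 and near 5 Mb.
reads = [
    ("chr1", 1_000, "BC1"), ("chr1", 1_500, "BC2"), ("chr1", 2_000, "BC3"),
    ("chr1", 5_001_000, "BC1"), ("chr1", 5_002_000, "BC2"),
    ("chr1", 5_003_000, "BC4"),
    ("chr1", 9_000_500, "BC5"),
]
windows = barcodes_by_window(reads, window_size=10_000)

linked = shared_barcode_fraction(windows, ("chr1", 0), ("chr1", 500))
unlinked = shared_barcode_fraction(windows, ("chr1", 0), ("chr1", 900))
print(linked, unlinked)
```

Two windows that lie far apart on the reference but share many barcodes may have been adjacent on the same long molecule in the sample, which is the kind of evidence a structural variant caller can weigh. A real method would also need to account for chance barcode collisions and varying molecule lengths.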


The consequences of structural genomic alterations in humans: genomic disorders, genomic instability and cancer.

Abstract: Over the last decade or so, sophisticated technological advances in array-based genomics have firmly established the contribution of structural alterations in the human genome to a variety of complex developmental disorders, and also to diseases such as cancer. In fact, multiple 'novel' disorders have been identified as a direct consequence of these advances. Our understanding of the molecular events leading to the generation of these structural alterations is also expanding. Many of the models proposed to explain these complex rearrangements involve DNA breakage and the coordinated action of DNA replication, repair and recombination machinery. Here, and within the context of Genomic Disorders, we will briefly review the principal models currently invoked to explain these chromosomal rearrangements, including Non-Allelic Homologous Recombination (NAHR), Fork Stalling Template Switching (FoSTeS), Microhomology Mediated Break-Induced Repair (MMBIR) and Breakage-fusion-bridge cycle (BFB). We will also discuss an unanticipated consequence of certain copy number variations (CNVs) whereby the CNVs potentially compromise fundamental processes controlling genomic stability including DNA replication and the DNA damage response. We will illustrate these using specific examples including Genomic Disorders (DiGeorge/Velocardiofacial syndrome, HSA21 segmental aneuploidy and rec(3) syndrome) and cell-based model systems. Finally, we will review some of the recent exciting developments surrounding specific CNVs and their contribution to cancer development as well as the latest model for cancer genome rearrangement, 'chromothripsis'.

Pub.: 02 Aug '11, Pinned: 30 Jun '17

Direct determination of diploid genome sequences.

Abstract: Determining the genome sequence of an organism is challenging, yet fundamental to understanding its biology. Over the past decade, thousands of human genomes have been sequenced, contributing deeply to biomedical research. In the vast majority of cases, these have been analyzed by aligning sequence reads to a single reference genome, biasing the resulting analyses, and in general, failing to capture sequences novel to a given genome. Some de novo assemblies have been constructed free of reference bias, but nearly all were constructed by merging homologous loci into single "consensus" sequences, generally absent from nature. These assemblies do not correctly represent the diploid biology of an individual. In exactly two cases, true diploid de novo assemblies have been made, at great expense. One was generated using Sanger sequencing, and one using thousands of clone pools. Here, we demonstrate a straightforward and low-cost method for creating true diploid de novo assemblies. We make a single library from ∼1 ng of high molecular weight DNA, using the 10x Genomics microfluidic platform to partition the genome. We applied this technique to seven human samples, generating low-cost HiSeq X data, then assembled these using a new "pushbutton" algorithm, Supernova. Each computation took 2 d on a single server. Each yielded contigs longer than 100 kb, phase blocks longer than 2.5 Mb, and scaffolds longer than 15 Mb. Our method provides a scalable capability for determining the actual diploid genome sequence in a sample, opening the door to new approaches in genomic biology and medicine.

Pub.: 07 Apr '17, Pinned: 29 Jun '17