Indexed on: 18 Jul '17Published on: 18 Jul '17Published in: Nature Methods
In read cloud approaches, microfluidic partitioning of long genomic DNA fragments and barcoding of shorter fragments derived from these fragments retains long-range information in short sequencing reads. This combination of short reads with long-range information represents a powerful alternative to single-molecule long-read sequencing. We develop Genome-wide Reconstruction of Complex Structural Variants (GROC-SVs) for SV detection and assembly from read cloud data and apply this method to Illumina-sequenced 10x Genomics sarcoma and breast cancer data sets. Compared with short-fragment sequencing, GROC-SVs substantially improves the specificity of breakpoint detection at comparable sensitivity. This approach also performs sequence assembly across multiple breakpoints simultaneously, enabling the reconstruction of events exhibiting remarkable complexity. We show that chromothriptic rearrangements occurred before copy number amplifications, and that rates of single-nucleotide variants and SVs are not correlated. Our results support the use of read cloud approaches to advance the characterization of large and complex structural variation.