Paired-end mapping reveals extensive structural variation in the human genome.

Research paper by Jan O. Korbel, Alexander Eckehart Urban, Jason P. Affourtit, Brian Godwin, Fabian Grubert, Jan Fredrik Simons, Philip M. Kim, Dean Palejev, Nicholas J. Carriero, Lei Du, Bruce E. Taillon, Zhoutao Chen, Andrea Tanzer, A. C. Eugenia Saunders, Jianxiang Chi, et al.

Indexed on: 29 Sep '07
Published on: 29 Sep '07
Published in: Science


Structural variation of the genome involves kilobase- to megabase-sized deletions, duplications, insertions, inversions, and complex combinations of rearrangements. We introduce high-throughput and massive paired-end mapping (PEM), a large-scale genome-sequencing method to identify structural variants (SVs) approximately 3 kilobases (kb) or larger that combines the rescue and capture of paired ends of 3-kb fragments, massive 454 sequencing, and a computational approach to map DNA reads onto a reference genome. PEM was used to map SVs in an African and in a putatively European individual and identified shared and divergent SVs relative to the reference genome. Overall, we fine-mapped more than 1000 SVs and documented that the number of SVs among humans is much larger than initially hypothesized; many of the SVs potentially affect gene function. The breakpoint junction sequences of more than 200 SVs were determined with a novel pooling strategy and computational analysis. Our analysis provided insights into the mechanisms of SV formation in humans.
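
The general idea behind paired-end mapping can be illustrated with a minimal sketch: once both ends of a fragment are mapped to the reference, the distance and relative orientation of the two ends hint at the type of variant. This is not the authors' pipeline. The `PairedEnd` record, the `classify` function, and the `TOLERANCE` cutoff are hypothetical names and values chosen for illustration; only the roughly 3-kb fragment size comes from the abstract.

```python
from dataclasses import dataclass

EXPECTED_SPAN = 3000  # approximate fragment length in bp (the abstract cites 3-kb fragments)
TOLERANCE = 1000      # hypothetical cutoff in bp; not a value taken from the paper


@dataclass
class PairedEnd:
    """Mapped positions of the two ends of one fragment (illustrative record)."""
    chrom: str
    start: int         # leftmost reference coordinate of the pair
    end: int           # rightmost reference coordinate of the pair
    same_strand: bool  # True if both ends map to the same strand (unexpected orientation)


def classify(pair: PairedEnd) -> str:
    """Label a mapped pair by comparing its reference span with the expected span."""
    if pair.same_strand:
        # Ends in an unexpected relative orientation suggest an inversion breakpoint.
        return "inversion candidate"
    span = pair.end - pair.start
    if span > EXPECTED_SPAN + TOLERANCE:
        # Ends lie farther apart on the reference than in the sample:
        # the sample is probably missing sequence present in the reference.
        return "deletion candidate"
    if span < EXPECTED_SPAN - TOLERANCE:
        # Ends lie closer together on the reference than in the sample:
        # the sample probably carries sequence absent from the reference.
        return "insertion candidate"
    return "concordant"


pairs = [
    PairedEnd("chr1", 10_000, 13_100, False),
    PairedEnd("chr1", 50_000, 58_500, False),
    PairedEnd("chr2", 20_000, 21_200, False),
    PairedEnd("chr3", 70_000, 73_000, True),
]
labels = [classify(p) for p in pairs]
```

In practice, a single discordant pair is weak evidence; a realistic caller would require several independent pairs supporting the same event before reporting an SV, and the cutoff would be derived from the observed span distribution rather than fixed by hand.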