Resolving the full spectrum of human genome variation using Linked-Reads

Patrick Marks (10X Genomics (United States)), Sarah Garcia (10X Genomics (United States)), Álvaro Martínez Barrio (10X Genomics (United States)), Kamila Belhocine (10X Genomics (United States)), Jorge Bernate (10X Genomics (United States)), Rajiv Bharadwaj (10X Genomics (United States)), Keith P. Bjornson (10X Genomics (United States)), Claudia Catalanotti (10X Genomics (United States)), Josh Delaney (10X Genomics (United States)), Adrian Fehr (10X Genomics (United States)), Ian T. Fiddes (10X Genomics (United States)), Brendan D. Galvin (10X Genomics (United States)), Haynes Heaton (10X Genomics (United States)), Jill Herschleb (10X Genomics (United States)), Christopher M. Hindson (10X Genomics (United States)), Esty Holt (Institute of Cancer Research), Cassandra B. Jabara (10X Genomics (United States)), Susanna Jett (10X Genomics (United States)), Nikka Keivanfar (10X Genomics (United States)), Sofia Kyriazopoulou-Panagiotopoulou (10X Genomics (United States)), Monkol Lek (Broad Institute), Bill K. Lin (10X Genomics (United States)), Adam J. Lowe (10X Genomics (United States)), Shazia Mahamdallie (Institute of Cancer Research), Shamoni Maheshwari (10X Genomics (United States)), Tony Makarewicz (10X Genomics (United States)), Jamie L. Marshall (Broad Institute), Francesca Meschi (10X Genomics (United States)), Christopher J. O'Keefe (10X Genomics (United States)), Heather Ordonez (10X Genomics (United States)), Pranav Patel (10X Genomics (United States)), Andrew Price (10X Genomics (United States)), Ariel Royall (10X Genomics (United States)), Elise Ruark (Institute of Cancer Research), Sheila Seal (Institute of Cancer Research), Michael Schnall-Levin (10X Genomics (United States)), Preyas Shah (10X Genomics (United States)), David Stafford (10X Genomics (United States)), Stephen R. Williams (10X Genomics (United States)), Indira Wu (10X Genomics (United States)), Andrew Wei Xu (10X Genomics (United States)), Nazneen Rahman (Institute of Cancer Research), Daniel G. MacArthur (Broad Institute), Deanna M. Church (10X Genomics (United States))
Genome Research
March 20, 2019
Cited by 312 · Open Access

Abstract

Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencing. However, these short-read approaches fail to give a complete picture of a genome. They struggle to identify structural events, cannot access repetitive regions, and fail to resolve the human genome into haplotypes. Here, we describe an approach that retains long-range information while maintaining the advantages of short reads. Starting from ∼1 ng of high-molecular-weight DNA, we produce barcoded short-read libraries. Novel informatic approaches allow the barcoded short reads to be associated with their original long molecules, producing a novel data type known as “Linked-Reads.” This approach allows for simultaneous detection of small and large variants from a single library. In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. Linked-Reads allow mapping to 38 Mb of sequence not accessible to short reads, adding sequence in 423 difficult-to-sequence genes, including the disease-relevant genes STRC, SMN1, and SMN2. Both Linked-Read whole-genome and whole-exome sequencing identify complex structural variations, including balanced events and single-exon deletions and duplications. Further, Linked-Reads extend the region of high-confidence calls by 68.9 Mb. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.
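
To make the idea of associating barcoded short reads with their original long molecules more concrete, the following is a minimal illustrative sketch and not the method used in this work. It assumes reads are already aligned and tagged with a barcode, groups them by barcode and chromosome, and starts a new inferred molecule whenever consecutive reads are farther apart than a gap threshold. The names `Read`, `Molecule`, `infer_molecules`, and the `max_gap` default of 50,000 bp are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    barcode: str  # barcode attached to the short read (hypothetical field name)
    chrom: str    # chromosome the read aligned to
    pos: int      # alignment start position


class Molecule(NamedTuple):
    barcode: str
    chrom: str
    start: int
    end: int
    n_reads: int


def infer_molecules(reads, max_gap=50_000):
    """Group reads sharing a barcode into inferred long molecules.

    max_gap is an assumed threshold: reads with the same barcode on the
    same chromosome are split into separate molecules when the distance
    between neighbouring reads exceeds it.
    """
    positions_by_key = defaultdict(list)
    for read in reads:
        positions_by_key[(read.barcode, read.chrom)].append(read.pos)

    molecules = []
    for (barcode, chrom), positions in sorted(positions_by_key.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large: close the current molecule, start a new one.
                molecules.append(Molecule(barcode, chrom, start, prev, count))
                start, count = pos, 0
            prev = pos
            count += 1
        molecules.append(Molecule(barcode, chrom, start, prev, count))
    return molecules


reads = [
    Read("BC1", "chr1", 1_000),
    Read("BC1", "chr1", 20_000),
    Read("BC1", "chr1", 45_000),
    Read("BC1", "chr1", 900_000),  # far from the others: a separate molecule
    Read("BC2", "chr1", 5_000),
]
molecules = infer_molecules(reads)
for molecule in molecules:
    print(molecule)
```

In this toy input, the three nearby `BC1` reads are grouped into one inferred molecule, while the distant `BC1` read and the `BC2` read each form their own. A real pipeline would also need to handle barcode sequencing errors, multi-mapping reads, and barcodes shared by several molecules, which this sketch ignores.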

