Multi-platform discovery of haplotype-resolved structural variation in human genomes

Mark Chaisson (University of Southern California), Ashley D. Sanders (European Molecular Biology Laboratory), Xuefang Zhao (Harvard University), Ankit Malhotra (Jackson Laboratory), David Porubský (University Medical Center Groningen), Tobias Rausch (European Molecular Biology Laboratory), Eugene J. Gardner (University of Maryland, Baltimore), Oscar L. Rodriguez (Icahn School of Medicine at Mount Sinai), Li Guo (Xi'an Jiaotong University), Ryan L. Collins (Harvard University), Xian Fan (The University of Texas MD Anderson Cancer Center), Jia Wen (University of North Carolina at Charlotte), Robert E. Handsaker (Broad Institute), Susan Fairley (European Bioinformatics Institute), Zev Kronenberg (University of Washington), Xiangmeng Kong (Yale University), Fereydoun Hormozdiari (University of California, Davis), Dillon Lee (University of Utah), Aaron M. Wenger (Pacific Biosciences (United States)), Alex Hastie (BioNano Genomics (United States)), Danny Antaki (University of California San Diego), Peter A. Audano (University of Washington), Harrison Brand (Harvard University), Stuart Cantsilieris (University of Washington), Han Cao (University of California San Diego), Eliza Cerveira (Jackson Laboratory), Chong Chen (The University of Texas MD Anderson Cancer Center), Xintong Chen (University of Maryland, Baltimore), Chen-Shan Chin (Pacific Biosciences (United States)), Zechen Chong (The University of Texas MD Anderson Cancer Center), Nelson T. Chuang (University of Maryland, Baltimore), Christine Lambert (Pacific Biosciences (United States)), Deanna M. Church (10X Genomics (United States)), Laura Clarke (European Bioinformatics Institute), Andrew Farrell (University of Utah), Joey Flores (Illumina (United States)), Timur R. Galeev (Yale University), David U. Gorkin (University of California San Diego), Madhusudan Gujral (University of California San Diego), Victor Guryev (University Medical Center Groningen), Haynes Heaton (10X Genomics (United States)), Jonas Korlach (Pacific Biosciences (United States)), Sushant Kumar (Yale University), Jee Young Kwon (Ewha Womans University), Jong Eun Lee (BioNano Genomics (United States)), Joyce Lee (BioNano Genomics (United States)), Wan-Ping Lee (Jackson Laboratory), Sau Peng Lee, Shantao Li (Whitney Museum of American Art), Patrick Marks (10X Genomics (United States)), Karine A. Viaud-Martinez (Illumina (United States)), Sascha Meiers (European Molecular Biology Laboratory), Katherine M. Munson (University of Washington), Fábio C. P. Navarro (Yale University), Bradley J. Nelson (University of Washington), Conor Nodzak (University of North Carolina at Charlotte), Amina Noor (University of California San Diego), Sofia Kyriazopoulou-Panagiotopoulou (10X Genomics (United States)), Andy Wing Chun Pang (BioNano Genomics (United States)), Yunjiang Qiu (University of California San Diego), Gabriel Rosanio (University of California San Diego), Mallory Ryan (Jackson Laboratory), Adrian M. Stütz (European Molecular Biology Laboratory), Diana C.J. Spierings (University Medical Center Groningen), Alistair Ward (University of Utah), AnneMarie E. Welch (University of Washington), Ming Xiao (Drexel University), Wei Xu (10X Genomics (United States)), Chengsheng Zhang (Jackson Laboratory), Qihui Zhu (Jackson Laboratory), Xiangqun Zheng-Bradley (European Bioinformatics Institute), Ernesto Lowy (European Bioinformatics Institute), Sergei Yakneen (European Molecular Biology Laboratory), Steven A. McCarroll (Broad Institute), Goo Jun (The University of Texas Health Science Center at Houston), Li Ding, Chong-Lek Koh (University of Malaya), Bing Ren (University of California San Diego), Paul Flicek (European Bioinformatics Institute), Ken Chen (The University of Texas MD Anderson Cancer Center), Mark Gerstein (Whitney Museum of American Art), Pui-Yan Kwok (Yale University), Peter M. Lansdorp (BC Cancer Agency), Gábor Marth (University of Utah), Jonathan Sebat (University of California San Diego), Xinghua Shi (University of North Carolina at Charlotte), Ali Bashir (Icahn School of Medicine at Mount Sinai), Kai Ye (Xi'an Jiaotong University), Scott E. Devine (University of Maryland, Baltimore), Michael E. Talkowski (Broad Institute), Ryan E. Mills (University of Michigan), Tobias Marschall (Max Planck Institute for Informatics), Jan O. Korbel (European Bioinformatics Institute), Evan E. Eichler (Howard Hughes Medical Institute), Charles Lee (Ewha Womans University)
bioRxiv (Cold Spring Harbor Laboratory)
September 23, 2017
Cited by 157. Open Access.

Abstract

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, and strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three human parent–child trios and define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per human genome. We also discover 156 inversions per genome, most of which previously escaped detection. Fifty-eight of the inversions we discovered intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The method and the dataset serve as a gold standard for the scientific community, and we make specific recommendations for maximizing structural variation sensitivity in future large-scale genome sequencing studies.

