Haplotype-resolved diverse human genomes and integrated analysis of structural variation

Peter Ebert (Heinrich Heine University Düsseldorf), Peter A. Audano (University of Washington), Qihui Zhu (Jackson Laboratory), Bernardo Rodríguez-Martín (European Molecular Biology Laboratory), David Porubský (University of Washington), Marc Jan Bonder (German Cancer Research Center), Arvis Sulovari (University of Washington), Jana Ebler (Heinrich Heine University Düsseldorf), Weichen Zhou (Washtenaw Community College), Rebecca Serra Mari (Heinrich Heine University Düsseldorf), Feyza Yilmaz (Jackson Laboratory), Xuefang Zhao (Broad Institute), PingHsun Hsieh (University of Washington), Joyce Lee (BioNano Genomics (United States)), Sushant Kumar (Yale University), Jiadong Lin (Xi'an Jiaotong University), Tobias Rausch (European Molecular Biology Laboratory), Yu Chen (University of Alabama at Birmingham), Jingwen Ren (University of Southern California), Martín Santamarina (Universidade de Santiago de Compostela), Wolfram Höps (European Molecular Biology Laboratory), Hufsah Ashraf (Heinrich Heine University Düsseldorf), Nelson T. Chuang (University of Maryland, Baltimore), Xiaofei Yang (Xi'an Jiaotong University), Katherine M. Munson (University of Washington), Alexandra P. Lewis (University of Washington), Susan Fairley (European Bioinformatics Institute), Luke J. Tallon (University of Maryland, Baltimore), Wayne E. Clarke (New York Genome Center), Anna O. Basile (New York Genome Center), Marta Byrska-Bishop (New York Genome Center), André Corvelo (New York Genome Center), Uday S. Evani (New York Genome Center), Tsung-Yu Lu (University of Southern California), Mark Chaisson (University of Southern California), Junjie Chen (Temple University), Chong Li (Temple University), Harrison Brand (Broad Institute), Aaron M. Wenger (Pacific Biosciences (United States)), Maryam Ghareghani (Max Planck Institute for Informatics), William T. Harvey (University of Washington), Benjamin Raeder (European Molecular Biology Laboratory), Patrick Hasenfeld (European Molecular Biology Laboratory), Allison Regier (Washington University in St. Louis), Haley Abel (Washington University in St. Louis), Ira M. Hall (Yale University), Paul Flicek (European Bioinformatics Institute), Oliver Stegle (German Cancer Research Center), Mark Gerstein (Yale University), José M. C. Tubío (Universidade de Santiago de Compostela), Zepeng Mu (University of Chicago), Yang Li (University of Chicago), Xinghua Shi (Temple University), Alex Hastie (BioNano Genomics (United States)), Kai Ye (University of Michigan), Zechen Chong (University of Alabama at Birmingham), Ashley D. Sanders (European Molecular Biology Laboratory), Michael C. Zody (New York Genome Center), Michael E. Talkowski (Broad Institute), Ryan E. Mills (Washtenaw Community College), Scott E. Devine (University of Maryland, Baltimore), Charles Lee (Ewha Womans University), Jan O. Korbel (European Bioinformatics Institute), Tobias Marschall (Heinrich Heine University Düsseldorf), Evan E. Eichler (Howard Hughes Medical Institute)
Science
February 25, 2021
Cited by 797 | Open Access

Abstract

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
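
The contiguity figure quoted above (the minimum contig length needed to cover 50% of the genome, commonly called N50) can be illustrated with a short sketch. This is not the authors' pipeline: the function name `n50`, the optional `genome_size` parameter, and the toy contig lengths are assumptions made for illustration. By default the denominator is the total assembled length; passing a genome size instead gives the variant usually called NG50.

```python
def n50(contig_lengths, genome_size=None):
    """Return the contig length at which the cumulative length of contigs,
    taken from longest to shortest, first reaches half of the target size.

    The target is the sum of all contig lengths unless `genome_size`
    (a hypothetical parameter for this sketch) is given. Returns None
    if the contigs never cover half of the target.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")

    target = genome_size if genome_size is not None else sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running against the target to stay in integer arithmetic.
        if 2 * running >= target:
            return length
    return None


# Toy example with made-up contig lengths (not data from the paper).
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))                   # half of 20 is reached at the contig of length 5
print(n50(toy_contigs, genome_size=30))   # half of 30 is reached at the contig of length 4
```

A larger value means that fewer, longer contigs account for half of the sequence, which is why the metric is used as a summary of assembly contiguity.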

