A Draft Human Pangenome Reference

Wen‐Wei Liao(James S. McDonnell Foundation), Mobin Asri(University of California, Santa Cruz), Jana Ebler(Heinrich Heine University Düsseldorf), Daniel Doerr(Heinrich Heine University Düsseldorf), Marina Haukness(University of California, Santa Cruz), Glenn Hickey(University of California, Santa Cruz), Shuangjia Lu(Yale University), Julian Lucas(University of California, Santa Cruz), Jean Monlong(University of California, Santa Cruz), Haley Abel(Washington University in St. Louis), Silvia Buonaiuto(Institute of Genetics and Biophysics), Xian Chang(University of California, Santa Cruz), Haoyu Cheng(Harvard University), Justin Chu(Dana-Farber Cancer Institute), Vincenza Colonna(University of Tennessee Health Science Center), Jordan M. Eizenga(University of California, Santa Cruz), Xiaowen Feng(Harvard University), Christian Fischer(University of Tennessee Health Science Center), Robert S. Fulton(James S. McDonnell Foundation), Shilpa Garg(University of Copenhagen), Cristian Groza(McGill University), Andrea Guarracino(Human Technopole), William T. Harvey(University of Washington), Simon Heumos(University of Tübingen), Kerstin Howe(Wellcome Sanger Institute), Miten Jain(Northeastern University), Tsung-Yu Lu(University of Southern California), Charles Markello(University of California, Santa Cruz), Fergal J. Martin(European Bioinformatics Institute), Matthew W. Mitchell(Coriell Institute For Medical Research), Katherine M. Munson(University of Washington), Moses Njagi Mwaniki(University of Pisa), Adam M. Novak(University of California, Santa Cruz), Hugh E. Olsen(University of California, Santa Cruz), Trevor Pesout(University of California, Santa Cruz), David Porubskỳ(University of Washington), Pjotr Prins(University of Tennessee Health Science Center), Jonas A. Sibbesen(University of Copenhagen), Chad Tomlinson(James S. McDonnell Foundation), Flavia Villani(University of Tennessee Health Science Center), Mitchell R. Vollger(University of Washington), Guillaume Bourque(Kyoto University), Mark Chaisson(University of Southern California), Paul Flicek(European Bioinformatics Institute), Adam M. Phillippy(National Institutes of Health), Justin M. Zook(National Institute of Standards and Technology), Evan E. Eichler(Howard Hughes Medical Institute), David Haussler(Howard Hughes Medical Institute), Erich D. Jarvis(Howard Hughes Medical Institute), Karen H. Miga(University of California, Santa Cruz), Ting Wang(Washington University in St. Louis), Erik Garrison(University of Tennessee Health Science Center), Tobias Marschall(Heinrich Heine University Düsseldorf), Ira M. Hall(Yale University), Heng Li(Harvard University), Benedict Paten(University of California, Santa Cruz)
bioRxiv (Cold Spring Harbor Laboratory)
July 9, 2022
Cited by 74
Open Access

Abstract

The Human Pangenome Reference Consortium (HPRC) presents a first draft human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence and are more than 99% accurate at the structural and base-pair levels. Based on alignments of the assemblies, we generated a draft pangenome that captures known variants and haplotypes, reveals novel alleles at structurally complex loci, and adds 119 million base pairs of euchromatic polymorphic sequence and 1,529 gene duplications relative to the existing reference, GRCh38. Roughly 90 million of the additional base pairs derive from structural variation. Using our draft pangenome to analyze short-read data reduces errors when discovering small variants by 34% and boosts the detected structural variants per haplotype by 104% compared to GRCh38-based workflows, and by 34% compared to using previous diversity sets of genome assemblies.

