Pangenomics enables genotyping of known structural variants in 5202 diverse genomes

Jouni Sirén (University of California, Santa Cruz), Jean Monlong (University of California, Santa Cruz), Xian Chang (University of California, Santa Cruz), Adam M. Novak (University of California, Santa Cruz), Jordan M. Eizenga (University of California, Santa Cruz), Charles Markello (University of California, Santa Cruz), Jonas A. Sibbesen (University of California, Santa Cruz), Glenn Hickey (University of California, Santa Cruz), Pi-Chuan Chang (Google (United States)), Andrew Carroll (Google (United States)), Namrata Gupta (Broad Institute), Stacey Gabriel (Broad Institute), Thomas W. Blackwell (University of Michigan), Aakrosh Ratan (University of Virginia), Kent D. Taylor (UCLA Medical Center), Stephen S. Rich (University of Virginia), Jerome I. Rotter (UCLA Medical Center), David Haussler (Howard Hughes Medical Institute), Erik Garrison (University of Tennessee Health Science Center), Benedict Paten (University of California, Santa Cruz)
Science
December 16, 2021
Cited by 409
Open Access

Abstract

We introduce Giraffe, a pangenome short-read mapper that can efficiently map to a collection of haplotypes threaded through a sequence graph. Giraffe maps sequencing reads to thousands of human genomes at a speed comparable to that of standard methods mapping to a single reference genome. The increased mapping accuracy enables downstream improvements in genome-wide genotyping pipelines for both small variants and larger structural variants. We used Giraffe to genotype 167,000 structural variants, discovered in long-read studies, in 5202 diverse human genomes that were sequenced using short reads. We conclude that pangenomics facilitates a more comprehensive characterization of variation and, as a result, has the potential to improve many genomic analyses.

