A map of human genome variation from population-scale sequencing
Min Hu(Wellcome Sanger Institute), Yuan Chen(Wellcome Sanger Institute), James Stalker(Wellcome Sanger Institute), Richard M. Durbin (Wellcome Sanger Institute), Si Quang Le(Wellcome Sanger Institute), Leopold Parts(Wellcome Sanger Institute), Allison Coffey(Wellcome Sanger Institute), Yujun Zhang(Wellcome Sanger Institute), Jeffrey C. Barrett(Wellcome Sanger Institute), Aarno Palotie(Wellcome Sanger Institute), Matt E. Hurles(Wellcome Sanger Institute), Harold P. Swerdlow(Wellcome Sanger Institute), Carol Scott(Wellcome Sanger Institute), John Burton(Wellcome Sanger Institute), Chris Tyler-Smith(Wellcome Sanger Institute), Sarah Lindsay(Wellcome Sanger Institute), Yali Xue(Wellcome Sanger Institute), Daniel Turner(Wellcome Sanger Institute), Klaudia Walter(Wellcome Sanger Institute), Donald F. Conrad(Wellcome Sanger Institute), Richard M. Durbin(Wellcome Sanger Institute), Quan Long(Wellcome Sanger Institute), Thomas M. Keane(Wellcome Sanger Institute), Tom Skelly(Wellcome Sanger Institute), Daniel G. MacArthur(Wellcome Sanger Institute), Luke Jostins(Wellcome Sanger Institute), Senduran Balasubramaniam(Wellcome Sanger Institute), Carol Churcher(Wellcome Sanger Institute), Ni Huang(Wellcome Sanger Institute), Anthony Cox(Wellcome Sanger Institute), Michael Quail(Wellcome Sanger Institute), David M. Carter(Wellcome Sanger Institute), Qasim Ayub(Wellcome Sanger Institute), Alison Coffey(Wellcome Sanger Institute), Petr Danecek(Wellcome Sanger Institute), Pardis C. Sabeti(Broad Institute), Andrew M. Kernytsky(Broad Institute), Heng Li(Broad Institute), Ilya A. Shlyakhter(Broad Institute), Carrie L. Sougnez(Broad Institute), Aaron McKenna(Broad Institute), Stacey B. Gabriel(Broad Institute), Jane Wilkinson (Broad Institute), David B. Jaffe(Broad Institute), Eric Banks(Broad Institute), Kiran V. Garimella(Broad Institute), David Altshuler(Broad Institute), Eric S. Lander(Broad Institute), Aaron D. Ball(Broad Institute), Toby Bloom(Broad Institute), Ryan E. 
Poplin(Broad Institute), Lauren Ambrogio(Broad Institute), Manuel A. Rivas(Broad Institute), Steven A. McCarroll(Broad Institute), Joshua M. Korn(Broad Institute), Stephen F. Schaffner(Broad Institute), Mark A. DePristo(Broad Institute), Tim J. Fennell(Broad Institute), Robert E. Handsaker(Broad Institute), James C. Nemesh(Broad Institute), Chris Hartl(Broad Institute), Sharon R. Grossman(Broad Institute), Jared R. Maguire(Broad Institute), Anthony A. Philippakis(Broad Institute), Erica Shefler(Broad Institute), Kristian Cibulskis(Broad Institute), Stacey B. Gabriel(Broad Institute), Matt Hanna(Broad Institute), David Altshuler(Broad Institute), David Altshuler (Broad Institute), Steven A. McCarroll (Broad Institute), David Altshuler(Broad Institute), David Altshuler(Broad Institute), Jun Ding(Statistical Service), Wei Chen(Statistical Service), Paul Anderson(Statistical Service), Matthew Snyder(Statistical Service), Gonçalo R. Abecasis(Statistical Service), Hyun Min Kang(Statistical Service), Tom Blackwell(Statistical Service), Carlo Sidore(Statistical Service), Sebastian Zöllner(Statistical Service), Gonçalo R. Abecasis (Statistical Service), Xiaowei Zhan(Statistical Service), Scott Kahn(Illumina (United Kingdom)), Terena James(Illumina (United Kingdom)), Niall Gormley(Illumina (United Kingdom)), R. Keira Cheetham(Illumina (United Kingdom)), Paula Kokko-Gonzales(Illumina (United Kingdom)), David R. Bentley(Illumina (United Kingdom)), Tony Cox(Illumina (United Kingdom)), Scott Kahn (Illumina (United Kingdom)), Zoya Kingsbury(Illumina (United Kingdom)), Lisa Murray(Illumina (United Kingdom)), Jennifer Stone(Illumina (United Kingdom)), Sean Humphray(Illumina (United Kingdom)), Michael Eberle(Illumina (United Kingdom)), Aravinda Chakravarti (Johns Hopkins University), Aravinda Chakravarti(Johns Hopkins University), Andrew G. Clark (Cornell University), Jeremiah Degenhardt(Cornell University), Francis S. Collins(National Institutes of Health), Francis S. 
Collins (National Institutes of Health), Yongming A. Sun , Fiona C. L. Hyland, Onur Sakarya, Yongming A. Sun, Francisco M. De La Vega, Gil A. McVean(Centre for Human Genetics), Adam Auton(Centre for Human Genetics), Peter Donnelly(Centre for Human Genetics), Gerton Lunter(Centre for Human Genetics), Jonathan L. Marchini(Centre for Human Genetics), Zamin Iqbal(Centre for Human Genetics), Gil A. McVean (Centre for Human Genetics), Simon Myers(Centre for Human Genetics), Michael Egholm(Pall Corporation (United States)), Rasko Leinonen(European Bioinformatics Institute), Fiona Cunningham(European Bioinformatics Institute), William M. McLaren(European Bioinformatics Institute), Xiangqun Zheng-Bradley(European Bioinformatics Institute), Paul Flicek(European Bioinformatics Institute), Richard E. Smith(European Bioinformatics Institute), Stephen Keenen(European Bioinformatics Institute), Eugene Kulesha(European Bioinformatics Institute), Javier Herrero(European Bioinformatics Institute), Vadim Zalunin(European Bioinformatics Institute), Richard E. Smith (European Bioinformatics Institute), Rajesh Radhakrishnan(European Bioinformatics Institute), Fuli Yu(Baylor College of Medicine), Huyen Dinh(Baylor College of Medicine), Richard A. Gibbs(Baylor College of Medicine), David Wheeler(Baylor College of Medicine), Mike Metzker(Baylor College of Medicine), Aniko Sabo(Baylor College of Medicine), Richard A. Gibbs (Baylor College of Medicine), Richard A. Gibbs(Baylor College of Medicine), Jin Yu(Baylor College of Medicine), Cristian Coafra(Baylor College of Medicine), Jeff Reid(Baylor College of Medicine), Danny Challis(Baylor College of Medicine), Christie Kovar(Baylor College of Medicine), Sandy Lee(Baylor College of Medicine), Lynne Nazareth(Baylor College of Medicine), Donna Muzny(Baylor College of Medicine), Matthew Bainbridge(Baylor College of Medicine), David Deiros(Baylor College of Medicine), Bartha M. Knoppers(McGill University), Tatiana A. 
Borodina(Max Planck Institute for Molecular Genetics), Marius Tolzmann(Max Planck Institute for Molecular Genetics), Dimitri V. Parkhomchuk(Max Planck Institute for Molecular Genetics), Vyacheslav S. Amstislavskiy(Max Planck Institute for Molecular Genetics), Marcus W. Albrecht(Max Planck Institute for Molecular Genetics), Florian Mertes(Max Planck Institute for Molecular Genetics), Wilfiried Nietfeld(Max Planck Institute for Molecular Genetics), Bernd Timmermann(Max Planck Institute for Molecular Genetics), Alexey N. Davydov(Max Planck Institute for Molecular Genetics), Aleksey V. Soldatov(Max Planck Institute for Molecular Genetics), Hans Lehrach(Max Planck Institute for Molecular Genetics), Ralf Herwig (Max Planck Institute for Molecular Genetics), Peter Marquardt(Max Planck Institute for Molecular Genetics), George Weinstock(Washington University in St. Louis), Michael C. Wendl(Washington University in St. Louis), Qunyuan Zhang(Washington University in St. Louis), Robert Fulton (Washington University in St. Louis), Lucinda Fulton(Washington University in St. Louis), Daniel C. Koboldt(Washington University in St. Louis), Ken Chen(Washington University in St. Louis), Mike D. McLellan(Washington University in St. Louis), David Dooling(Washington University in St. Louis), Robert Fulton(Washington University in St. Louis), Li Ding(Washington University in St. Louis), John W. Wallis(Washington University in St. Louis), Asif Chinwalla(Washington University in St. Louis), Richard K. Wilson(Washington University in St. Louis), Elaine R. Mardis(Washington University in St. Louis), Gil A. McVean(Centre for Human Genetics), Loukas Moutsianas(University of Oxford), Afidalina Tumian (University of Oxford), Jonathan L. Marchini(Centre for Human Genetics), Simon Myers(Centre for Human Genetics), Gozde Aksay(University of Washington), Gozde Aksay (University of Washington), Jeffrey M. Kidd(University of Washington), Deborah A. 
Nickerson(University of Washington), Audrey Duncanson (Wellcome Trust), Alan J. Schafer(Wellcome Trust), Alan J. Schafer(Wellcome Trust), Hoda M. Khouri(National Institutes of Health), Kirill E. Rotmistrovsky(National Institutes of Health), Chunlin Xiao(National Institutes of Health), Robert D. Sanders(National Institutes of Health), Lon D. Phan(National Institutes of Health), Richa Agarwala(National Institutes of Health), Martin F. Shumway (National Institutes of Health), Justin E. Paschall(National Institutes of Health), Martin F. Shumway(National Institutes of Health), Aleksandr O. Morgulis(National Institutes of Health), Stephen T. Sherry(National Institutes of Health), Jian Wang(BGI Group (China)), Yan Zhou(BGI Group (China)), Geng Tian(BGI Group (China)), Yeyang Su(BGI Group (China)), Wei Wang(BGI Group (China)), Hancheng Zheng(BGI Group (China)), Ruiqiang Li(BGI Group (China)), Huanming Yang(BGI Group (China)), Guoqing Li(BGI Group (China)), Ruibang Luo(BGI Group (China)), Min Jian(BGI Group (China)), Shuaishuai Tai(BGI Group (China)), Xiaole Zheng(BGI Group (China)), Yingrui Li(BGI Group (China)), Xiuqing Zhang(BGI Group (China)), Taosha Li (BGI Group (China)), Xiaosen Guo(BGI Group (China)), Huisong Zheng(BGI Group (China)), Jun Wang(BGI Group (China)), Honglong Wu(BGI Group (China)), Huiqing Liang(BGI Group (China)), Bo Wang(BGI Group (China)), Xiaodong Fang(BGI Group (China)), Jun Wang(BGI Group (China)), Ruiqiang Li (BGI Group (China)), Ruiqiang Li(BGI Group (China)), Jonathan M. Manning, Yutao Fu, Clarence C. Lee, Stephen F. McLaughlin, Eric F. Tsung, Heather E. Peckham, Yutao Fu , Jeffry K. Ichikawa, Gina L. 
Costa, Andreas Dahl(Technische Universität Dresden), Stefan Schreiber(Christian-Albrechts-Universität zu Kiel), Philip Rosenstiel (Christian-Albrechts-Universität zu Kiel), Amanda Caprio(Enzo Life Sciences (United States)), Andrew Kebbel(Enzo Life Sciences (United States)), Adam Burke(Enzo Life Sciences (United States)), Craig Mealmaker(Enzo Life Sciences (United States)), Faheem Niazi(Enzo Life Sciences (United States)), Kristen Pareja(Enzo Life Sciences (United States)), Jason Affourtit(Enzo Life Sciences (United States)), Wanmin Song(Enzo Life Sciences (United States)), Kalvin Kao(Enzo Life Sciences (United States)), Melissa Bachorski(Enzo Life Sciences (United States)), Louise McDade(Enzo Life Sciences (United States)), Matthew Labrecque(Enzo Life Sciences (United States)), Melissa Minderman(Enzo Life Sciences (United States)), Eli Buglione(Enzo Life Sciences (United States)), James Knight(Enzo Life Sciences (United States)), David Riches(Enzo Life Sciences (United States)), Said Attiya(Enzo Life Sciences (United States)), Shally Wang(Enzo Life Sciences (United States)), Christopher Celone(Enzo Life Sciences (United States)), Ravi Ramenani(Enzo Life Sciences (United States)), Shauna Clark(Enzo Life Sciences (United States)), Lisa Gu(Enzo Life Sciences (United States)), Anne Nawrocki(Enzo Life Sciences (United States)), Dana Ashworth(Enzo Life Sciences (United States)), Lorri Guccione(Enzo Life Sciences (United States)), David Conners(Enzo Life Sciences (United States)), Roger Winer (Enzo Life Sciences (United States)), Cynthia Turcotte(Enzo Life Sciences (United States)), Jennifer Knowlton(Enzo Life Sciences (United States)), Brian Desany(Enzo Life Sciences (United States)), Aarno Palotie(Wellcome Sanger Institute), Aarno Palotie (Wellcome Sanger Institute), Anniek De Witte (Agilent Technologies (United States)), Shane Giles(Agilent Technologies (United States)), Erik P. Garrison(Boston College), Jiantao Wu(Boston College), Alistair N. 
Ward (Boston College), Deniz Kural(Boston College), Amit Indap(Boston College), Wen Fung Leong(Boston College), Alistair N. Ward(Boston College), Wan-Ping Lee(Boston College), Chip Stewart(Boston College), Weichun Huang(National Institute of Environmental Health Sciences), Aaron R. Quinlan (University of Virginia), Aaron R. Quinlan(University of Virginia), Aaron R. Quinlan(University of Virginia), Michael P. Stromberg(Illumina (United States)), Michael P. Stromberg (Illumina (United States)), Ryan E. Mills (Brigham and Women's Hospital), David Altshuler(Broad Institute), Ryan E. Mills(Brigham and Women's Hospital), Xinghua Shi(Brigham and Women's Hospital), Brian L. Browning(University of Washington Medical Center), Pardis C. Sabeti (Broad Institute), Sharon R. Grossman(Broad Institute), Ilya A. Shlyakhter(Broad Institute), Alkes Price(Harvard University), Matthew Mort(Cardiff University), Peter D. Stenson(Cardiff University), Andrew D. Phillips (Cardiff University), Edward V. Ball(Cardiff University), Vladimir Makarov(Icahn School of Medicine at Mount Sinai), Seungtai C. Yoon(Icahn School of Medicine at Mount Sinai), Vladimir Makarov (Icahn School of Medicine at Mount Sinai), Kenny Ye(Albert Einstein College of Medicine), Kenny Ye (Albert Einstein College of Medicine), Hugo Y. K. Lam(Stanford University), Jeffrey M. Kidd(University of Washington), Fabian Grubert(Stanford University), Simon Gravel (Stanford University), Mark Kaganovich(Stanford University), Alexander E. Urban(Stanford Medicine), Adrian M. Stütz (European Molecular Biology Laboratory), Jan O. Korbel(European Molecular Biology Laboratory), Kai Ye (Albert Einstein College of Medicine), Kai Ye(Albert Einstein College of Medicine), Miriam K. Konkel (Louisiana State University), Jerilyn A. Walker(Louisiana State University), Miriam K. Konkel(Louisiana State University), John V. Pearson(Translational Genomics Research Institute), Waibhav D. Tembe(Translational Genomics Research Institute), Steve M. 
Beckstrom-Sternberg(Translational Genomics Research Institute), Alexis Christoforides(Translational Genomics Research Institute), Ahmet A. Kurdoglu(Translational Genomics Research Institute), Shripad A. Sinari (Translational Genomics Research Institute), Sol J. Katzman(University of California, Santa Cruz), Robert M. Kuhn (University of California, Santa Cruz), Angie S. Hinrichs(University of California, Santa Cruz), Andrew Kern(University of California, Santa Cruz), Molly Przeworski(Howard Hughes Medical Institute), Ryan D. Hernandez(University of California, San Francisco), Joanna L. Kelley (University of Chicago), S. Cord Melton(University of Chicago), Bryan Howie(University of Chicago), William O. Cookson (Lung Institute), Miriam F. Moffatt(Lung Institute), Mark Lathrop(Centre National de Recherche en Génomique Humaine), Liming Liang(Harvard University), Paul Scheet(The University of Texas MD Anderson Cancer Center), Jonathan Keebler(Centre Hospitalier de l’Université de Montréal), Eric A. Stone (Centre Hospitalier de l’Université de Montréal), Ferran Casals(Centre Hospitalier de l’Université de Montréal), Youssef Idaghdour(Centre Hospitalier de l’Université de Montréal), Martine Zilversmit(Centre Hospitalier de l’Université de Montréal), Jinchuan Xing (University of Utah), Lynn Jorde(University of Utah), Jinchuan Xing(University of Utah), Can Alkan(Howard Hughes Medical Institute), Evan E. Eichler(Howard Hughes Medical Institute), Can Alkan (Howard Hughes Medical Institute), Fereydoun Hormozdiari(Simon Fraser University), Iman Hajirasouliha (Simon Fraser University), Iman Hajirasouliha(Simon Fraser University), Cornelis A. Albers(National Health Service), Stephen B. Montgomery(University of Geneva), Emmanouil T. 
Dermitzakis (University of Geneva), Hanjun Jin(Korea National Institute of Health), Alexej Abyzov(Yale University), Xinmeng Jasmine Mu(Yale University), Jing Leng (Yale University), Lukas Habegger(Yale University), Jing Leng(Yale University), Rajini Haraksingh(Yale University), Justin Jee(Yale University), Jiang Du(Yale University), Robert Bjornson(Yale University), Jiang Du (Yale University), Zhengdong Zhang(Yale University), Zhengdong Zhang (Yale University), Ekta Khurana(Yale University), Suganthi Balasubramanian(Yale University), Suganthi Balasubramanian(Yale University), Alexander E. Urban (Stanford Medicine), Alexander E. Urban(Stanford Medicine), Lorraine H. Toji(Coriell Institute For Medical Research), Neda Gharani (Coriell Institute For Medical Research), Jane S. Kaye(University of Oxford), Alastair Kent(Genetic Alliance), Amy L. McGuire(Baylor College of Medicine), Pilar N. Ossorio(University of Wisconsin–Madison), Charles N. Rotimi(National Institutes of Health), Mark S. Guyer(National Institutes of Health), Lisa D. Brooks(National Institutes of Health), Nicholas C. Clemm(National Institutes of Health), Lisa D. Brooks(National Institutes of Health), Jane L. Peterson (National Institutes of Health), Adam L. Felsenfeld(National Institutes of Health), Jean E. McEwen(National Institutes of Health), Assya Abdallah(George Washington University), Christopher R. Juenger(United States Food and Drug Administration), Eric D. Green(National Institutes of Health), Reed A. Cartwright(Rice University)
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother–father–child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.

This issue of Nature contains the first publication from the 1000 Genomes Project, an international collaboration that will produce an extensive public catalogue of human genetic variation. The plan is to sequence about 2,000 unidentified individuals from 20 populations around the world. This first paper presents the results from the project's pilot phase, which tested three different strategies for genome-wide sequencing with high-throughput platforms: low-coverage whole-genome sequencing of 179 individuals in three population groups, high-coverage sequencing of two mother–father–child trios, and exon-targeted sequencing of 697 individuals from seven populations.

The goal of the 1000 Genomes Project is to provide in-depth information on variation in human genome sequences. In the pilot phase reported here, different strategies for genome-wide sequencing, using high-throughput sequencing platforms, were developed and compared. The resulting data set includes more than 95% of the currently accessible variants found in any individual, and can be used to inform association and functional studies.
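
To give a feel for what a de novo substitution rate of roughly 10⁻⁸ per base pair per generation means in practice, the following minimal sketch multiplies that rate by a genome size to get an expected number of new substitutions per child. The haploid genome length of 3 × 10⁹ bp, the ploidy argument, and the function name are assumptions chosen for illustration and are not taken from the paper; the portion of the genome that was actually accessible to sequencing is smaller, so the result is an order-of-magnitude figure only.

```python
def expected_de_novo_mutations(rate_per_bp: float,
                               haploid_genome_bp: float,
                               ploidy: int = 2) -> float:
    """Expected number of new base substitutions in one child.

    rate_per_bp       : mutation rate per base pair per generation
    haploid_genome_bp : number of base pairs in one genome copy
    ploidy            : genome copies inherited (one per parent)
    """
    return rate_per_bp * haploid_genome_bp * ploidy


# Rate quoted in the abstract (approximate).
MUTATION_RATE = 1e-8

# Assumption: round figure for the human haploid genome length,
# used here only to illustrate the arithmetic.
HAPLOID_GENOME_BP = 3.0e9

expected = expected_de_novo_mutations(MUTATION_RATE, HAPLOID_GENOME_BP)
print(f"Expected de novo substitutions per child: about {expected:.0f}")
```

Under these assumptions the estimate comes out at a few tens of new substitutions per generation, which may help explain why deeply sequenced parent–child trios are needed to observe such events directly.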