Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek (Broad Institute), Konrad J. Karczewski (Broad Institute), Eric Vallabh Minikel (Broad Institute), Kaitlin E. Samocha (Broad Institute), Eric Banks (Broad Institute), Timothy R. Fennell (Broad Institute), Anne O’Donnell‐Luria (Massachusetts General Hospital), James S. Ware (Broad Institute), Andrew Hill (Broad Institute), Beryl B. Cummings (Broad Institute), Taru Tukiainen (Broad Institute), Daniel P. Birnbaum (Broad Institute), Jack A. Kosmicki (Broad Institute), Laramie E. Duncan (Broad Institute), Karol Estrada (Broad Institute), Fengmei Zhao (Broad Institute), James Zou (Broad Institute), Emma Pierce‐Hoffman (Broad Institute), Joanne Berghout (University of Arizona), D.N. Cooper (Cardiff University), Nicole Deflaux (Google (United States)), Mark A. DePristo (Broad Institute), Ron Do (Child Health and Development Institute), Jason Flannick (Broad Institute), Menachem Fromer (Broad Institute), Laura D. Gauthier (Broad Institute), Jackie Goldstein (Broad Institute), Namrata Gupta (Broad Institute), Daniel P. Howrigan (Broad Institute), Adam Kieżun (Broad Institute), Mitja Kurki (Broad Institute), Ami Levy Moonshine (Broad Institute), Pradeep Natarajan (Broad Institute), Lorena Orozco (National Institute of Genomic Medicine), Gina M. Peloso (Broad Institute), Ryan Poplin (Broad Institute), Manuel A. Rivas (Broad Institute), Valentín Ruano-Rubio (Broad Institute), Samuel A. Rose (Broad Institute), Douglas M. Ruderfer (Icahn School of Medicine at Mount Sinai), Khalid Shakir (Broad Institute), Peter D. Stenson (Cardiff University), Christine Stevens (Broad Institute), Brett Thomas (Broad Institute), Grace Tiao (Broad Institute), Maria T. Tusie-Luna (Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán), Ben Weisburd (Broad Institute), Hong‐Hee Won (Samsung Medical Center), Dongmei Yu (Broad Institute), David Altshuler (Broad Institute), Diego Ardissino (University of Parma), Michael Boehnke (University of Michigan), John Danesh (Department of Public Health), Stacey Donnelly (Broad Institute), Roberto Elosúa (Hospital Del Mar), José C. Florez (Broad Institute), Stacey Gabriel (Broad Institute), Gad Getz (Broad Institute), Stephen J. Glatt (SUNY Upstate Medical University), Christina M. Hultman (Karolinska Institutet), Sekar Kathiresan (Broad Institute), Markku Laakso (University of Eastern Finland), Steven A. McCarroll (Broad Institute), Mark I. McCarthy (Centre for Human Genetics), Dermot McGovern (Cedars-Sinai Medical Center), Ruth McPherson (University of Ottawa), Benjamin M. Neale (Broad Institute), Aarno Palotie (Broad Institute), Shaun Purcell (Icahn School of Medicine at Mount Sinai), Danish Saleheen (Center for Non-Communicable Diseases), Jeremiah M. Scharf (Broad Institute), Pamela Sklar (Allen Institute for Brain Science), Patrick F. Sullivan (University of North Carolina at Chapel Hill), Jaakko Tuomilehto (University of Helsinki), Ming T. Tsuang (University of California San Diego), Hugh Watkins (Centre for Human Genetics), James G. Wilson (University of Mississippi Medical Center), Mark J. Daly (Broad Institute), Daniel G. MacArthur (Broad Institute)
Nature
August 1, 2016
Cited by 10,294
Open Access

Abstract

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation. We identify 3,230 genes with near-complete depletion of predicted protein-truncating variants, 72% of which have no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.
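
The filtering use case mentioned in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical in-memory lookup of reference allele frequencies; the variant keys, the frequency values, the function name `filter_candidates`, and the 0.1% cutoff are all invented for illustration and are not taken from ExAC or from the paper's method.

```python
from typing import Dict, List

# Hypothetical reference allele frequencies keyed by "chrom-pos-ref-alt".
# These values are invented for illustration; they are not real ExAC data.
REFERENCE_AF: Dict[str, float] = {
    "1-1000-A-G": 0.12,      # common in the (made-up) reference panel
    "2-2000-C-T": 0.00002,   # very rare in the (made-up) reference panel
}


def filter_candidates(
    variants: List[str],
    reference_af: Dict[str, float],
    max_af: float = 0.001,  # assumed cutoff, chosen only for this example
) -> List[str]:
    """Keep variants that are absent from the reference or no more frequent than max_af."""
    # A variant missing from the reference is treated as having frequency 0.0.
    return [v for v in variants if reference_af.get(v, 0.0) <= max_af]


patient_variants = ["1-1000-A-G", "2-2000-C-T", "3-3000-G-A"]
candidates = filter_candidates(patient_variants, REFERENCE_AF)
print(candidates)
```

The idea is that a variant common in a large reference population is unlikely to cause a rare, severe disease, so it can be dropped from the candidate list. An appropriate cutoff in practice would depend on the disease and its inheritance model.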

