Quality Control Procedures for Genome‐Wide Association Studies

doi:10.1002/0471142905.hg0119s68

Stephen Turner (Vanderbilt University), Loren L. Armstrong (Northwestern University), Yuki Bradford (Vanderbilt University), Christopher S. Carlson (Fred Hutch Cancer Center), Dana C. Crawford (Vanderbilt University), Andrew Crenshaw (Broad Institute), Mariza de Andrade (Mayo Clinic), Kimberly F. Doheny (Johns Hopkins University), Jonathan L. Haines (Vanderbilt University), Geoffrey Hayes (Northwestern University), Gail P. Jarvik (University of Washington), Lan Jiang (Vanderbilt University), Iftikhar J. Kullo (Mayo Clinic), Rongling Li (National Institutes of Health), Hua Ling (Johns Hopkins University), Teri A. Manolio (National Institutes of Health), Martha Matsumoto (Mayo Clinic), Catherine A. McCarty (Marshfield Clinic), Andrew McDavid (Fred Hutch Cancer Center), Daniel B. Mirel (Broad Institute), Justin Paschall (National Institutes of Health), Elizabeth Pugh (Johns Hopkins University), Luke V. Rasmussen (Marshfield Clinic), Russell A. Wilke (Vanderbilt University), Rebecca L. Zuvich (Vanderbilt University), Marylyn D. Ritchie (Vanderbilt University)

Current Protocols in Human Genetics

January 1, 2011

Abstract

Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of complex disease. Regardless of context, the practical utility of this information will ultimately depend upon the quality of the original data. Quality control (QC) procedures for GWAS are computationally intensive, operationally challenging, and constantly evolving. Here we enumerate some of the challenges in QC of GWAS data and describe the approaches that the electronic MEdical Records and Genomics (eMERGE) network is using for quality assurance in GWAS data, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of GWAS data, including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We propose best practices and discuss areas of ongoing and future research.
