Searching for missing heritability: Designing rare variant association studies

Or Zuk (Broad Institute), S. F. Schaffner (Broad Institute), Kaitlin E. Samocha (Broad Institute), Ron Do (Broad Institute), Eliana Hechter (Broad Institute), Sekar Kathiresan (Broad Institute), Mark J. Daly (Broad Institute), Benjamin M. Neale (Broad Institute), Shamil Sunyaev (Broad Institute), Eric S. Lander (Broad Institute)
Proceedings of the National Academy of Sciences
January 17, 2014
Cited by 684; Open Access

Abstract

Genetic studies have revealed thousands of loci predisposing to hundreds of human diseases and traits, revealing important biological pathways and defining novel therapeutic hypotheses. However, the genes discovered to date typically explain less than half of the apparent heritability. Because efforts have largely focused on common genetic variants, one hypothesis is that much of the missing heritability is due to rare genetic variants. Studies of common variants are typically referred to as genomewide association studies, whereas studies of rare variants are often simply called sequencing studies. Because they are actually closely related, we use the terms common variant association study (CVAS) and rare variant association study (RVAS). In this paper, we outline the similarities and differences between RVAS and CVAS and describe a conceptual framework for the design of RVAS. We apply the framework to address key questions about the sample sizes needed to detect association, the relative merits of testing disruptive alleles vs. missense alleles, frequency thresholds for filtering alleles, the value of predictors of the functional impact of missense alleles, the potential utility of isolated populations, the value of gene-set analysis, and the utility of de novo mutations. The optimal design depends critically on the selection coefficient against deleterious alleles and thus varies across genes. The analysis shows that common variant and rare variant studies require similarly large sample collections. In particular, a well-powered RVAS should involve discovery sets with at least 25,000 cases, together with a substantial replication set.

