A consensus variant-to-function score to functionally prioritize variants for disease

Tabassum Fabiha (Memorial Sloan Kettering Cancer Center), Ivy Evergreen (Stanford University), Soumya Kundu (Stanford University), Anusri Pampari (Stanford University), Sergey Abramov (Altius Institute for Biomedical Sciences), Alexandr Boytsov (Altius Institute for Biomedical Sciences), Kari Strouse (Duke University), K Dura (Duke University), Weixiang Fang (Johns Hopkins University), Gaspard Kerner (Harvard University), John D. Butts (Jackson Laboratory), Thahmina Ali (Memorial Sloan Kettering Cancer Center), Andreas R. Gschwind (Lucile Packard Children's Hospital), Kristy S. Mualim (Carnegie Institution for Science), Jill E. Moore (University of Massachusetts Chan Medical School), Zhiping Weng (University of Massachusetts Chan Medical School), Jacob C. Ulirsch (Illumina (United States)), Hongkai Ji (Johns Hopkins University), Jeff Vierstra (Altius Institute for Biomedical Sciences), Timothy E. Reddy (Duke University), Stephen B. Montgomery (Stanford University), J Engreitz (Stanford University), Anshul Kundaje (Stanford University), Ryan Tewhey (Jackson Laboratory), Alkes L. Price (Harvard University), Kushal K. Dey (Memorial Sloan Kettering Cancer Center)
bioRxiv (Cold Spring Harbor Laboratory)
November 9, 2024
Cited by 9 | Open Access

Abstract

Identifying and functionally characterizing causal disease variants in genome-wide association studies remains a pressing challenge. Here, we construct a consensus variant-to-function (cV2F) score that assigns a single value to each common single-nucleotide variant in the genome, and helps to predict and characterize causal disease variants. The cV2F score leverages features reflecting variant-level experimentally and computationally predicted function (e.g. allelic imbalance and sequence-based deep learning models) and element-level function (e.g. predicted enhancers), and learns optimal combinations of features by training a gradient boosting model on GWAS fine-mapping results. The cV2F-annotated variants attained an AUPRC of 0.822 at identifying held-out fine-mapped variants. Variants with high cV2F scores are highly enriched for heritability (14.2x, s.e. 0.5) across 66 diseases/traits, are uniquely informative for disease heritability, and are highly predictive of variants implicated by reporter assays; cV2F substantially outperforms previous variant-to-function scores using all of these metrics. GWAS fine-mapping of 110 diseases/traits informed by cV2F identified 14.3% more confidently fine-mapped (PIP > 0.95) variants than non-functionally informed fine-mapping. We further constructed tissue/cell line-specific cV2F scores that prioritize variants based on regulatory potential in specific tissues/cell lines, attaining high heritability enrichment for tissue-related diseases/traits (15.6x, s.e. 2.3) while providing independent information (average correlation of 0.27 with the primary cV2F score). We highlight examples of GWAS loci for which cV2F pinpoints causal variants with high confidence and elucidates their functional role.
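The two headline metrics in the abstract, AUPRC and heritability enrichment, can be illustrated with a minimal sketch. It assumes AUPRC is estimated as average precision and that enrichment is the conventional ratio of the proportion of heritability to the proportion of variants in an annotation; the function names and all input numbers below are hypothetical toy values, not the authors' pipeline or data.

```python
def average_precision(labels, scores):
    """Estimate AUPRC as average precision.

    labels: 1 for a (toy) fine-mapped variant, 0 otherwise.
    scores: higher means more likely functional.
    Ties in score are kept in input order (sorted() is stable).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank variants from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


def enrichment(h2_in_annotation, h2_total, snps_in_annotation, snps_total):
    """Proportion of heritability divided by proportion of variants."""
    return (h2_in_annotation / h2_total) / (snps_in_annotation / snps_total)


# Toy inputs (assumptions for illustration only).
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
ap = average_precision(toy_labels, toy_scores)

# Toy case: 125 of 2000 variants carry 25% of the heritability.
enr = enrichment(0.25, 1.0, 125, 2000)

print(f"average precision = {ap:.3f}")
print(f"enrichment = {enr:.1f}x")
```

An enrichment above 1 means the annotated variants explain more heritability than their share of the genome would suggest, which is the sense in which the 14.2x figure above should be read.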

