Predicting expression-altering promoter mutations with deep learning

Kishore Jaganathan(Illumina (United States)), Nicole M. Ferraro(Illumina (United States)), Gherman Novakovsky(Illumina (United States)), Yuchuan Wang(Illumina (United States)), Terena James(Illumina (United States)), Jeremy Schwartzentruber(Illumina (United States)), Petko Fiziev(Illumina (United States)), Irfahan Kassam(Illumina (United States)), Fan Cao(Illumina (United States)), Johann S. Hawe(Illumina (United States)), Henry Cavanagh(Illumina (United States)), Ashley J. W. Lim(Illumina (United States)), Grace Png(Illumina (United States)), Jeremy F. McRae(Illumina (United States)), Abhimanyu Banerjee(Illumina (United States)), Arvind Kumar(Illumina (United States)), Jacob C. Ulirsch(Illumina (United States)), Yan Zhang(Illumina (United States)), François Aguet(Illumina (United States)), Pierrick Wainschtein(Illumina (United States)), Laksshman Sundaram(Illumina (United States)), Adriana Salcedo(Illumina (United States)), Sofia Kyriazopoulou Panagiotopoulou(Illumina (United States)), Delasa Aghamirzaie(Illumina (United States)), Evin M. Padhi(Stanford University), Ziming Weng(Stanford University), Shan Dong(University of California, San Francisco), Damian Smedley(Queen Mary University of London), Mark J. Caulfield(Queen Mary University of London), Anne O’Donnell‐Luria(Broad Institute), Heidi L. Rehm(Broad Institute), Stephan Sanders(University of California, San Francisco), Anshul Kundaje(Stanford University), Stephen B. Montgomery(Stanford University), Mark T. Ross(Illumina (United States)), Kyle Kai‐How Farh(Illumina (United States))
Science
May 29, 2025
Cited by 40

Abstract

Only a minority of patients with rare genetic diseases are presently diagnosed by exome sequencing, suggesting that additional unrecognized pathogenic variants may reside in noncoding sequence. In this work, we describe PromoterAI, a deep neural network that accurately identifies noncoding promoter variants that dysregulate gene expression. We show that promoter variants with predicted expression-altering consequences produce outlier expression at both the RNA and protein levels in thousands of individuals and that these variants experience strong negative selection in human populations. We observed that clinically relevant genes in patients with rare diseases are enriched for such variants and validated their functional impact through reporter assays. Our estimates suggest that promoter variation accounts for 6% of the genetic burden associated with rare diseases.

