An encyclopedia of enhancer-gene regulatory interactions in the human genome

Andreas R. Gschwind (Lucile Packard Children's Hospital), Kristy S. Mualim (Carnegie Institution for Science), Alireza Karbalayghareh (Memorial Sloan Kettering Cancer Center), Maya U. Sheth (Lucile Packard Children's Hospital), Kushal K. Dey (Memorial Sloan Kettering Cancer Center), Evelyn Jagoda (Broad Institute), Ramil Nurtdinov (Centre for Genomic Regulation), Xi Wang (Johns Hopkins University), Anthony S. Tan (Lucile Packard Children's Hospital), H. Spencer Jones (Lucile Packard Children's Hospital), X. Rosa (Lucile Packard Children's Hospital), David Yao (Stanford University), Joseph Nasser (Broad Institute), Žiga Avsec (Google DeepMind (United Kingdom)), Benjamin T. James (Broad Institute), Muhammad S. Shamim (Baylor College of Medicine), Neva C. Durand (Broad Institute), Suhas S.P. Rao (University of California, San Francisco), Ragini Mahajan (Baylor College of Medicine), Benjamin R. Doughty (Stanford University), Kalina Andreeva (Stanford University), Jacob C. Ulirsch (Broad Institute), Kaili Fan (Harvard University), Elizabeth M. Perez (Broad Institute), Tri C. Nguyen (Lucile Packard Children's Hospital), David R. Kelley (Enzo Life Sciences (United States)), Hilary K. Finucane (Broad Institute), Jill E. Moore (University of Massachusetts Chan Medical School), Zhiping Weng (University of Massachusetts Chan Medical School), Manolis Kellis (Broad Institute), Michael C. Bassik (Stanford University), Alkes L. Price (Harvard University), M Beer (Johns Hopkins University), Roderic Guigó (Centre for Genomic Regulation), J Stamatoyannopoulos (Fred Hutch Cancer Center), Erez Lieberman Aiden (Broad Institute), William J. Greenleaf (Stanford University), Christina S. Leslie (Memorial Sloan Kettering Cancer Center), Lars M. Steinmetz (European Molecular Biology Laboratory), Anshul Kundaje (Stanford University), J Engreitz (Broad Institute)
bioRxiv (Cold Spring Harbor Laboratory)
November 13, 2023
Cited by 115. Open Access.

Abstract

Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease (refs. 1–6). Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium (ref. 7). We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication, including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.
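To make the benchmarking idea concrete, the following is a minimal sketch of how two predictors might be compared against CRISPR-validated element-gene pairs. It is not the authors' pipeline: the abstract does not state which metric is used, so average precision (a step-wise area under the precision-recall curve) is an assumption here, and the function name, model names, scores, and labels are all hypothetical toy values.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Average precision of a ranking of element-gene pairs.

    scores: predictor score per pair (higher = more likely regulatory).
    labels: True if the pair was a validated regulatory interaction
            in a (hypothetical) CRISPR perturbation dataset.
    Ties in score are broken by input order, which is a simplification.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(1 for is_pos in labels if is_pos)
    if n_pos == 0:
        raise ValueError("need at least one positive pair")

    # Rank pairs from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx]:
            true_positives += 1
            # Precision at the rank where this positive is recovered.
            precision_sum += true_positives / rank
    return precision_sum / n_pos


# Toy data (invented for illustration): six element-gene pairs.
crispr_labels = [True, False, True, False, False, True]
predictions = {
    "model_a": [0.9, 0.2, 0.8, 0.1, 0.3, 0.7],  # ranks all positives first
    "model_b": [0.2, 0.9, 0.8, 0.1, 0.7, 0.3],  # mixes positives and negatives
}

results = {
    name: average_precision(scores, crispr_labels)
    for name, scores in predictions.items()
}
for name, ap in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: average precision = {ap:.3f}")
```

Scoring every model against the same held-out set of labeled pairs is what allows a like-for-like comparison; a real benchmark would also need to handle class imbalance, score ties, and per-cell-type evaluation, none of which this sketch attempts.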

