Aberrant splicing prediction across human tissues

Muhammed Hasan Çelik (University of California, Irvine), Nils Wagner (Helmholtz Association of German Research Centres), Florian R. Hölzlwimmer (Technical University of Munich), Vicente A. Yépez (Technical University of Munich), Christian Mertes (Technical University of Munich), Holger Prokisch (Helmholtz Zentrum München), Julien Gagneur (Helmholtz Association of German Research Centres)
bioRxiv (Cold Spring Harbor Laboratory)
June 15, 2022
Cited by 8 · Open Access

Abstract

Aberrant splicing is a major cause of genetic disorders, but its direct detection in transcriptomes is limited to clinically accessible tissues such as skin or body fluids. While DNA-based machine learning models allow prioritization of rare variants likely to affect splicing, their performance in predicting tissue-specific aberrant splicing remains unassessed. Here, we generated the first aberrant splicing benchmark dataset, spanning over 8.8 million rare variants in 49 human tissues. At 20% recall, state-of-the-art DNA-based models cap at 10% precision. By mapping and quantifying tissue-specific splice site usage transcriptome-wide and modeling isoform competition, we increased precision three-fold at the same recall. Integrating RNA-sequencing data of clinically accessible tissues brought precision to 60%. These results, replicated in two independent cohorts, substantially contribute to non-coding loss-of-function variant identification and to genetic diagnostics design and analytics.
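
The headline numbers in the abstract are precision values read off at a fixed recall (20%). As a rough illustration of how such a figure can be computed from ranked predictions, here is a minimal Python sketch. It is not the authors' evaluation code: the function name `precision_at_recall`, the tie handling, and the toy labels and scores are all assumptions made for this example.

```python
from typing import Sequence


def precision_at_recall(
    labels: Sequence[int], scores: Sequence[float], target_recall: float
) -> float:
    """Best precision over all score cutoffs whose recall is >= target_recall.

    Hypothetical helper for illustration; labels are 1 (aberrant) or 0 (not).
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank variants from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best = 0.0
    true_pos = 0
    for i, (score, label) in enumerate(ranked):
        true_pos += label
        # Evaluate a cutoff only after the last item of a tied-score group.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        recall = true_pos / total_pos
        precision = true_pos / (i + 1)
        if recall >= target_recall:
            best = max(best, precision)
    return best


# Made-up toy data: 10 variants, 4 of which are truly aberrant.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

print(precision_at_recall(toy_labels, toy_scores, 0.2))
```

In a benchmark with very few positives among millions of rare variants, this kind of fixed-recall precision is more informative than accuracy, because almost any model scores well on the overwhelmingly negative majority.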

