Technical University of Munich
ORCID: 0000-0002-5522-2562
Publishes on Genetic Associations and Epidemiology, Genomics and Rare Diseases, Molecular Biology Techniques and Applications. 14 papers and 138 citations.
Rare genetic variants can have strong effects on phenotypes, yet accounting for rare variants in genetic analyses is statistically challenging due to the limited number of allele carriers and the burden of multiple testing. While rich variant annotations promise to enable well-powered rare variant association tests, methods integrating variant annotations in a data-driven manner are lacking. Here we propose deep rare variant association testing (DeepRVAT), a model based on set neural networks that learns a trait-agnostic gene impairment score from rare variant annotations and phenotypes, enabling both gene discovery and trait prediction. On 34 quantitative and 63 binary traits, using whole-exome-sequencing data from UK Biobank, we find that DeepRVAT yields substantial gains in gene discoveries and improved detection of individuals at high genetic risk. Finally, we demonstrate how DeepRVAT enables calibrated and computationally efficient rare variant tests at biobank scale, aiding the discovery of genetic risk factors for human disease traits.
Despite the frequent implication of aberrant gene expression in diseases, algorithms predicting aberrantly expressed genes of an individual are lacking. To address this need, we compile an aberrant expression prediction benchmark covering 8.2 million rare variants from 633 individuals across 49 tissues. While not geared toward aberrant expression, the deleteriousness score CADD and the loss-of-function predictor LOFTEE show mild predictive ability (1% to 1.6% average precision). Leveraging these and further variant annotations, we next train AbExp, a model that yields 12% average precision by combining, in a tissue-specific fashion, expression variability with variant effects on isoforms and on aberrant splicing. Integrating expression measurements from clinically accessible tissues leads to another two-fold improvement. Furthermore, we show on UK Biobank blood traits that performing rare variant association testing using the continuous and tissue-specific AbExp variant scores instead of LOFTEE variant burden increases gene discovery sensitivity and enables improved phenotype predictions.
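The average precision figures quoted above summarize how well a score ranks the rare positive cases ahead of the negatives. As a minimal sketch of the metric only (not the authors' evaluation code), the function below computes average precision as the mean of the precision values at the rank of each positive. The function name and the toy labels and scores are hypothetical, and ties between scores are not handled specially.

```python
def average_precision(labels, scores):
    """Mean of precision@rank taken at the rank of every positive label.

    labels: iterable of 0/1 (1 = aberrant, in this illustration)
    scores: iterable of floats, higher means "more likely positive"
    """
    labels = list(labels)
    scores = list(scores)
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank items from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / n_pos


# Hypothetical toy data: 2 positives among 8 items.
toy_labels = [0, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.1, 0.9, 0.8, 0.2, 0.7, 0.3, 0.4, 0.05]
toy_ap = average_precision(toy_labels, toy_scores)
print(f"average precision: {toy_ap:.3f}")
```

In this toy ranking the positives land at ranks 1 and 3, so the result is (1/1 + 2/3) / 2. For a random ranking, average precision is roughly the fraction of positives, which is why values of a few percent can still be informative when positives are very rare.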
Aberrant splicing is a major cause of genetic disorders but its direct detection in transcriptomes is limited to clinically accessible tissues such as skin or body fluids. While DNA-based machine learning models allow prioritizing rare variants for affecting splicing, their performance on predicting tissue-specific aberrant splicing remains unassessed. Here, we generated the first aberrant splicing benchmark dataset, spanning over 8.8 million rare variants in 49 human tissues. At 20% recall, state-of-the-art DNA-based models cap at 10% precision. By mapping and quantifying tissue-specific splice site usage transcriptome-wide and modeling isoform competition, we increased precision by three-fold at the same recall. Integrating RNA-sequencing data of clinically accessible tissues brought precision to 60%. These results, replicated in two independent cohorts, substantially contribute to non-coding loss-of-function variant identification and to genetic diagnostics design and analytics.
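The "precision at 20% recall" figures above fix the share of true aberrant events that must be recovered and then report how many of the predictions made up to that point are correct. The following is a minimal sketch of one common way to compute this, assuming precision is read off at the first rank where recall reaches the target. The function name and the toy data are hypothetical and do not come from the benchmark.

```python
def precision_at_recall(labels, scores, target_recall):
    """Precision at the first rank where recall reaches target_recall.

    labels: iterable of 0/1 ground-truth values
    scores: iterable of floats, higher means "more likely positive"
    target_recall: float in (0, 1]
    """
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")
    labels = list(labels)
    scores = list(scores)
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")

    # Walk down the ranking from the highest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        hits += label
        if hits / n_pos >= target_recall:
            return hits / rank  # fraction of calls so far that are correct
    # Not reached for valid input: recall is 1.0 at the last rank.
    return hits / len(ranked)


# Hypothetical toy data: 5 positives among 10 items, already sorted by score.
toy_labels = [0, 1, 0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
print(precision_at_recall(toy_labels, toy_scores, 0.2))  # 0.5
```

Here the first positive sits at rank 2, so 20% recall (1 of 5 positives) is reached with a precision of 1/2.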
OBJECTIVE: Genomic sequencing leaves >50% of dystonia-affected individuals without a diagnosis. Where DNA-oriented approaches remain insufficient, integrating multiomics is essential to advance genome interpretation. Herein, we incorporated RNA sequencing (RNA-seq) data from 167 patients with dystonia across a range of ages and presentations. METHODS: We leveraged an RNA-seq analysis pipeline, focused on the identification of expression and splicing aberrations, on RNA-seq from skin biopsies. Of the recruited patients, 85.0% had early-onset dystonia, 92.2% had non-focal dystonia, and 76.0% had coexisting features. Thirty-six patient samples with pre-identified variants (36/167, 21.6%) and 131 samples with no previously prioritized diagnostic candidates from genomic sequencing (131/167, 78.4%) were evaluated. RESULTS: We found that >80% of dystonia-associated genes were detected by fibroblast RNA-seq. Expression and splicing aberration analyses produced a manageable number of significant RNA defects affecting dystonia-associated genes. The approach was especially successful in validating pathogenic effects of loss-of-function variants, with disease-relevant RNA-underexpression detected for 66.7% (10/15). Studying aberrant expression and splicing in the context of other pre-identified variant types yielded relevant results in 28.6% (6/21 samples). We obtained a 6.9% (9/131) diagnostic uplift for patients without prior candidates, all of whom exhibited combined dystonia with autosomal recessive inheritance. The new diagnoses from RNA-seq and genomic reanalysis were based on previously neglected splice-region (3/9) and deep(er) intronic (6/9) variants. For the observed events, integration of new machine-learning scores predicted corresponding aberrant gene expression in the brain. INTERPRETATION: Fibroblast-based RNA-seq in our selected cohort improved variant interpretation and offered a modest yield in patients without prior candidate variants.
ANN NEUROL 2026;99:1363-1378.