Deep learning of the tissue-regulated splicing code

Michael K. K. Leung (Canadian Institute for Advanced Research), Hui Xiong (Canadian Institute for Advanced Research), Leo J. Lee (Canadian Institute for Advanced Research), Brendan J. Frey (Canadian Institute for Advanced Research)
Bioinformatics
June 11, 2014
Cited by 485 · Open Access

Abstract

MOTIVATION: Alternative splicing (AS) is a regulated process that directs the generation of different transcripts from single genes. A computational model that can accurately predict splicing patterns based on genomic features and cellular context is highly desirable, both in understanding this widespread phenomenon and in exploring the effects of genetic variations on AS.

METHODS: Using a deep neural network, we developed a model inferred from mouse RNA-Seq data that can predict splicing patterns in individual tissues and differences in splicing patterns across tissues. Our architecture uses hidden variables that jointly represent features in genomic sequences and tissue types when making predictions. A graphics processing unit was used to greatly reduce the training time of our models with millions of parameters.

RESULTS: We show that the deep architecture surpasses the performance of the previous Bayesian method for predicting AS patterns. With the proper optimization procedure and selection of hyperparameters, we demonstrate that deep architectures can be beneficial, even with a moderately sparse dataset. An analysis of what the model has learned in terms of the genomic features is presented.
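
As a rough illustration of hidden variables that jointly represent genomic features and tissue type, the sketch below feeds a genomic feature vector together with a one-hot tissue encoding through shared hidden layers and returns a probability distribution over splicing-pattern categories. It is a minimal sketch under stated assumptions, not the authors' model: the input concatenation, the ReLU activation, the layer sizes, the number of output categories, and all function names (`init_params`, `predict`, `one_hot`) are hypothetical choices made for illustration, and the weights are random rather than trained.

```python
import numpy as np


def one_hot(index, size):
    """Encode a tissue index as a one-hot vector (hypothetical encoding)."""
    v = np.zeros(size)
    v[index] = 1.0
    return v


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def init_params(n_features, n_tissues, hidden_sizes, n_classes, seed=0):
    """Random, untrained weights for a small fully connected network."""
    rng = np.random.default_rng(seed)
    sizes = [n_features + n_tissues, *hidden_sizes, n_classes]
    return [
        (rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def predict(params, features, tissue_index, n_tissues):
    """Forward pass: hidden layers see genomic features and tissue together."""
    h = np.concatenate([features, one_hot(tissue_index, n_tissues)])
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)  # ReLU is an assumption, not from the source
    W, b = params[-1]
    return softmax(h @ W + b)  # probabilities over hypothetical pattern categories


# Hypothetical sizes: 20 genomic features, 5 tissues, 3 output categories.
params = init_params(n_features=20, n_tissues=5, hidden_sizes=[16, 8], n_classes=3)
probs = predict(params, np.ones(20), tissue_index=2, n_tissues=5)
```

Because the tissue encoding enters the same hidden layers as the genomic features, changing `tissue_index` for a fixed feature vector can change the predicted distribution, which is the sense in which the hidden units represent both inputs jointly in this sketch.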

