MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery

Kai Wang (University of Kentucky), Darshan Singh (University of North Carolina at Chapel Hill), Zheng Zeng (University of North Carolina at Chapel Hill), S.J. Coleman (University of North Carolina at Chapel Hill), Yan Huang (University of North Carolina at Chapel Hill), Gleb L. Savich (University of North Carolina at Chapel Hill), Xiaping He (University of North Carolina at Chapel Hill), Piotr A. Mieczkowski (University of North Carolina at Chapel Hill), Sara A. Grimm (University of North Carolina at Chapel Hill), Charles M. Perou (University of North Carolina at Chapel Hill), James N. MacLeod (University of North Carolina at Chapel Hill), Derek Y. Chiang (University of North Carolina at Chapel Hill), Jan F. Prins (University of North Carolina at Chapel Hill), Jinze Liu (University of North Carolina at Chapel Hill)
Nucleic Acids Research
August 28, 2010

Abstract

The accurate mapping of reads that span splice junctions is a critical component of all analytic techniques that work with RNA-seq data. We introduce a second-generation splice detection algorithm, MapSplice, which focuses on high sensitivity and specificity in the detection of splices as well as on CPU and memory efficiency. MapSplice can be applied to both short (<75 bp) and long (≥75 bp) reads. MapSplice does not depend on splice site features or intron length; consequently, it can detect novel canonical as well as non-canonical splices. MapSplice leverages the quality and diversity of the read alignments that support a given splice to increase accuracy. We demonstrate that MapSplice achieves higher sensitivity and specificity than TopHat and SpliceMap on a set of simulated RNA-seq data. Experimental studies also support the accuracy of the algorithm. Splice junctions derived from eight breast cancer RNA-seq datasets recapitulated the extensiveness of alternative splicing on a global level as well as the differences between molecular subtypes of breast cancer. These combined results indicate that MapSplice is a highly accurate algorithm for the alignment of RNA-seq reads to splice junctions. Software download URL: http://www.netlab.uky.edu/p/bioinfo/MapSplice.
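
The abstract does not say how the "diversity" of read alignments is measured. As an illustration only, the sketch below assumes one plausible measure: the Shannon entropy of the offsets within each read at which the junction is crossed. The function name and the example offsets are hypothetical and are not taken from the MapSplice software.

```python
import math
from collections import Counter


def junction_position_entropy(offsets):
    """Shannon entropy (in bits) of the read offsets at which a junction is crossed.

    `offsets` holds one integer per supporting read: the position inside the
    read where the splice occurs. This is an assumed diversity measure, not
    necessarily the one MapSplice implements.
    """
    if not offsets:
        return 0.0
    counts = Counter(offsets)
    total = len(offsets)
    # Sum p * log2(1/p) over the distinct offsets.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical junctions: reads stacked at one offset vs. spread over several.
stacked = [10, 10, 10, 10]
spread = [5, 12, 20, 31]
print(junction_position_entropy(stacked))
print(junction_position_entropy(spread))
```

Under this assumption, a junction whose supporting reads all cross it at the same offset scores lower than one crossed at many different offsets, so a filter could treat the first as weaker evidence.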
