Homology-driven assembly of NOn-redundant protEin sequence sets (NOmESS) for mass spectrometry

Tikira Temu (Max Planck Institute of Biochemistry), Matthias Mann (Max Planck Institute of Biochemistry), Markus Räschle (Max Planck Institute of Biochemistry), Jürgen Cox
Bioinformatics
January 6, 2015
Cited by 11 | Open Access

Abstract

To enable mass spectrometry (MS)-based proteomic studies of poorly characterized organisms, we developed a computational workflow for the homology-driven assembly of a non-redundant reference sequence dataset. In the automated pipeline, translated DNA sequences (e.g. ESTs, RNA deep-sequencing data) are aligned to those of a closely related and fully sequenced organism. Representative sequences are derived from each cluster and joined, resulting in a non-redundant reference set representing the maximal available amino acid sequence information for each protein. Here we apply NOmESS to assemble a reference database for the widely used model organism Xenopus laevis and demonstrate its use in proteomic applications.

AVAILABILITY AND IMPLEMENTATION: NOmESS is written in C#. The source code and the executables can be downloaded from http://www.biochem.mpg.de/cox. Running NOmESS additionally requires BLASTp and cd-hit.

CONTACT: cox@biochem.mpg.de

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
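
The cluster-and-join idea in the abstract can be illustrated with a small Python sketch. It is not the NOmESS implementation (which is written in C# and relies on BLASTp and cd-hit); it only shows one plausible way to group aligned fragments by their reference protein and merge them into a single sequence. The `Hit` record, the assumption of ungapped alignments, the preference for higher-scoring fragments, and the use of `X` for positions no fragment covers are all assumptions made for this illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One alignment of a translated fragment to a reference protein (hypothetical record)."""
    query_id: str     # translated EST or RNA-seq contig
    ref_id: str       # matching protein of the related, fully sequenced organism
    ref_start: int    # 1-based start of the alignment on the reference protein
    aligned_seq: str  # aligned fragment residues, assumed ungapped for simplicity
    bitscore: float   # alignment score used to rank overlapping fragments


def cluster_by_reference(hits):
    """Group fragments that align to the same reference protein."""
    clusters = defaultdict(list)
    for hit in hits:
        clusters[hit.ref_id].append(hit)
    return clusters


def join_cluster(hits):
    """Merge the fragments of one cluster into a single sequence.

    Fragments are placed on the reference coordinates; where fragments
    overlap, the residue from the higher-scoring fragment is kept.
    Uncovered positions between fragments are written as 'X' (an assumption).
    """
    residues = {}
    for hit in sorted(hits, key=lambda h: h.bitscore, reverse=True):
        for offset, amino_acid in enumerate(hit.aligned_seq):
            residues.setdefault(hit.ref_start + offset, amino_acid)
    if not residues:
        return ""
    first, last = min(residues), max(residues)
    return "".join(residues.get(pos, "X") for pos in range(first, last + 1))


def assemble(hits):
    """Return one joined sequence per reference protein."""
    clusters = cluster_by_reference(hits)
    return {ref_id: join_cluster(members) for ref_id, members in clusters.items()}


# Toy input with made-up identifiers and sequences.
hits = [
    Hit("est1", "P1", 1, "MKTAY", 50.0),
    Hit("est2", "P1", 4, "AYIAK", 40.0),
    Hit("est3", "P1", 12, "QRS", 30.0),
    Hit("est4", "P2", 3, "GLV", 20.0),
]
reference_set = assemble(hits)
```

In this toy input, three fragments map to the hypothetical protein `P1`: two overlap and one is separated by an uncovered stretch, so the joined entry contains the merged overlap followed by `X` placeholders and the third fragment. A real pipeline would also have to handle gapped alignments, conflicting residues, and redundancy among the input fragments.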

