Deep RNA sequencing of L. monocytogenes reveals overlapping and extensive stationary phase and sigma B-dependent transcriptomes, including multiple highly transcribed noncoding RNAsBACKGROUND: Identification of specific genes and gene expression patterns important for bacterial survival, transmission and pathogenesis is critically needed to enable development of more effective pathogen control strategies. The stationary phase stress response transcriptome, including many sigmaB-dependent genes, was defined for the human bacterial pathogen Listeria monocytogenes using RNA sequencing (RNA-Seq) with the Illumina Genome Analyzer. Specifically, bacterial transcriptomes were compared between stationary phase cells of L. monocytogenes 10403S and an otherwise isogenic DeltasigB mutant, which does not express the alternative sigma factor sigmaB, a major regulator of genes contributing to stress response, including stresses encountered upon entry into stationary phase. RESULTS: Overall, 83% of all L. monocytogenes genes were transcribed in stationary phase cells; 42% of currently annotated L. monocytogenes genes showed medium to high transcript levels under these conditions. A total of 96 genes had significantly higher transcript levels in 10403S than in DeltasigB, indicating sigmaB-dependent transcription of these genes. RNA-Seq analyses indicate that a total of 67 noncoding RNA molecules (ncRNAs) are transcribed in stationary phase L. monocytogenes, including 7 previously unrecognized putative ncRNAs. Application of a dynamically trained Hidden Markov Model, in combination with RNA-Seq data, identified 65 putative sigmaB promoters upstream of 82 of the 96 sigmaB-dependent genes and upstream of the one sigmaB-dependent ncRNA. The RNA-Seq data also enabled annotation of putative operons as well as visualization of 5'- and 3'-UTR regions. CONCLUSIONS: The results from these studies provide powerful evidence that RNA-Seq data combined with appropriate bioinformatics tools allow quantitative characterization of prokaryotic transcriptomes, thus providing exciting new strategies for exploring transcriptional regulatory networks in bacteria.
Target-Decoy Approach and False Discovery Rate: When Things May Go WrongNitin Gupta, Nuno Bandeira, Uri Keich et al.|Journal of the American Society for Mass Spectrometry|2011 The target-decoy approach (TDA) has done the field of proteomics a great service by filling in the need to estimate the false discovery rates (FDR) of peptide identifications. While TDA is often viewed as a universal solution to the problem of FDR evaluation, we argue that the time has come to critically re-examine TDA and to acknowledge not only its merits but also its demerits. We demonstrate that some popular MS/MS search tools are not TDA-compliant and that it is easy to develop a non-TDA compliant tool that outperforms all TDA-compliant tools. Since the distinction between TDA-compliant and non-TDA compliant tools remains elusive, we are concerned about a possible proliferation of non-TDA-compliant tools in the future (developed with the best intentions). We are also concerned that estimation of the FDR by TDA awkwardly depends on a virtual coin toss and argue that it is important to take the coin toss factor out of our estimation of the FDR. Since computing FDR via TDA suffers from various restrictions, we argue that TDA is not needed when accurate p-values of individual Peptide-Spectrum Matches are available.
Finding motifs in the twilight zoneMOTIVATION: Gene activity is often affected by binding transcription factors to short fragments in DNA sequences called motifs. Identification of subtle regulatory motifs in a DNA sequence is a difficult pattern recognition problem. In this paper we design a new motif finding algorithm that can detect very subtle motifs. RESULTS: We introduce the notion of a multiprofile and use it for finding subtle motifs in DNA sequences. Multiprofiles generalize the notion of a profile and allow one to detect subtle patterns that escape detection by the standard profiles. Our MULTIPROFILER algorithm outperforms other leading motif finding algorithms in a number of synthetic models. Moreover, it can be shown that in some previously studied motif models, MULTIPROFILER is capable of pushing the performance envelope to its theoretical limits. AVAILABILITY: http://www-cse.ucsd.edu/groups/bioinformatics/software.html