An estimate of the total number of true human miRNAs
While the number of human miRNA candidates continuously increases, only a few of them are completely characterized and experimentally validated. Toward determining the total number of true miRNAs, we employed a combined in silico high- and experimental low-throughput validation strategy. We collected 28 866 human small RNA sequencing data sets containing 363.7 billion sequencing reads and excluded falsely annotated and low quality data. Our high-throughput analysis identified 65% of 24 127 mature miRNA candidates as likely false-positives. Using northern blotting, we experimentally validated miRBase entries and novel miRNA candidates. By exogenous overexpression of 108 precursors that encode 205 mature miRNAs, we confirmed 68.5% of the miRBase entries with the confirmation rate going up to 94.4% for the high-confidence entries and 18.3% of the novel miRNA candidates. Analyzing endogenous miRNAs, we verified the expression of 8 miRNAs in 12 different human cell lines. In total, we extrapolated 2300 true human mature miRNAs, 1115 of which are currently annotated in miRBase V22. The experimentally validated miRNAs will contribute to revising targetomes hypothesized by utilizing falsely annotated miRNAs.
Multiple Sclerosis: MicroRNA Expression Profiles Accurately Differentiate Patients with Relapsing-Remitting Disease from Healthy Controls
Multiple sclerosis (MS) is a chronic inflammatory demyelinating disease of the central nervous system, which is heterogeneous with respect to clinical manifestations and response to therapy. Identification of biomarkers appears desirable for an improved diagnosis of MS as well as for monitoring of disease activity and treatment response. MicroRNAs (miRNAs) are short non-coding RNAs, which have been shown to have the potential to serve as biomarkers for different human diseases, most notably cancer. Here, we analyzed the expression profiles of 866 human miRNAs. In detail, we investigated the miRNA expression in blood cells of 20 patients with relapsing-remitting MS (RRMS) and 19 healthy controls using a human miRNA microarray and the Geniom Real Time Analyzer (GRTA) platform. We identified 165 miRNAs that were significantly up- or downregulated in patients with RRMS as compared to healthy controls. The best single miRNA marker, hsa-miR-145, allowed discriminating MS from controls with a specificity of 89.5%, a sensitivity of 90.0%, and an accuracy of 89.7%. A set of 48 miRNAs that was evaluated by radial basis function kernel support vector machines and 10-fold cross validation yielded a specificity of 95%, a sensitivity of 97.6%, and an accuracy of 96.3%. While 43 of the 165 miRNAs deregulated in patients with MS have previously been related to other human diseases, the remaining 122 miRNAs are so far exclusively associated with MS. The implications of our study are twofold. The miRNA expression profiles in blood cells may serve as a biomarker for MS, and deregulation of miRNA expression may play a role in the pathogenesis of MS.
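The single-marker figures reported for hsa-miR-145 can be reproduced from a confusion matrix. The sketch below is illustrative only: the abstract gives the cohort sizes (20 patients, 19 controls) but not the individual counts, so the values 18 correctly classified patients and 17 correctly classified controls are an assumption inferred from the reported percentages. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # fraction of patients detected
    specificity = tn / (tn + fp)                 # fraction of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # fraction of all samples correct
    return sensitivity, specificity, accuracy


# Assumed counts (not stated in the abstract): 18 of 20 RRMS patients and
# 17 of 19 healthy controls classified correctly by the single marker.
sens, spec, acc = classification_metrics(tp=18, fn=2, tn=17, fp=2)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under these assumed counts the three values round to the percentages quoted for the best single marker.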
GeneTrail--advanced gene set enrichment analysis
We present a comprehensive and efficient gene set analysis tool, called 'GeneTrail' that offers a rich functionality and is easy to use. Our web-based application facilitates the statistical evaluation of high-throughput genomic or proteomic data sets with respect to enrichment of functional categories. GeneTrail covers a wide variety of biological categories and pathways, among others KEGG, TRANSPATH, TRANSFAC, and GO. Our web server provides two common statistical approaches, 'Over-Representation Analysis' (ORA) comparing a reference set of genes to a test set, and 'Gene Set Enrichment Analysis' (GSEA) scoring sorted lists of genes. Besides other newly developed features, GeneTrail's statistics module includes a novel dynamic-programming algorithm that improves the P-value computation of GSEA methods considerably. GeneTrail is freely accessible at http://genetrail.bioinf.uni-sb.de.
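Over-representation analysis is commonly formulated as a one-sided hypergeometric test. The following is a minimal sketch of that textbook formulation, assuming the test set is a subset of the reference set; it is not taken from GeneTrail, whose exact test and corrections are not described in the abstract. The function name `ora_p_value` and its parameters are hypothetical.

```python
from math import comb


def ora_p_value(n_reference, n_category, n_test, n_overlap):
    """One-sided hypergeometric P-value for over-representation.

    n_reference: genes in the reference set
    n_category:  reference genes annotated with the category
    n_test:      genes in the test set (assumed subset of the reference)
    n_overlap:   test genes annotated with the category
    Returns P(X >= n_overlap) when drawing n_test genes without replacement.
    """
    upper = min(n_category, n_test)
    total = comb(n_reference, n_test)
    tail = sum(
        comb(n_category, i) * comb(n_reference - n_category, n_test - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 reference genes, 5 in the category, 5 test genes, all 5 in it.
p = ora_p_value(n_reference=10, n_category=5, n_test=5, n_overlap=5)
print(p)
```

In practice the P-values of many categories would also need a multiple-testing correction, which this sketch leaves out.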
A new Lamarckian genetic algorithm for flexible ligand‐receptor docking
Jan Fuhrmann, Alexander Rurainski, Hans‐Peter Lenhof et al.|Journal of Computational Chemistry|2010
We present a Lamarckian genetic algorithm (LGA) variant for flexible ligand-receptor docking that can handle a large number of degrees of freedom. Our hybrid method combines a multi-deme LGA with a recently published gradient-based method for local optimization of molecular complexes. We compared the performance of our new hybrid method to two non-gradient-based search heuristics on the Astex diverse set for flexible ligand-receptor docking. Our results show that the novel approach is clearly superior to other LGAs employing a stochastic optimization method. The new algorithm features a shorter run time and gives substantially better results, especially with increasing complexity of the ligands. Thus, it may be used to dock ligands with many rotatable bonds with high efficiency.
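The general idea of a Lamarckian genetic algorithm with gradient-based local optimization can be illustrated on a toy problem. The sketch below is not the authors' method: it uses a single population rather than multiple demes, a quadratic function as a stand-in for a docking score, and plain gradient descent as the local optimizer. All names (`energy`, `local_optimize`, `lamarckian_ga`) and parameter values are hypothetical.

```python
import random


def energy(x):
    """Toy stand-in for a docking score: lower is better, minimum at the origin."""
    return sum(xi * xi for xi in x)


def gradient(x):
    """Analytic gradient of the toy energy."""
    return [2.0 * xi for xi in x]


def local_optimize(x, step=0.1, iters=5):
    """A few steps of plain gradient descent."""
    for _ in range(iters):
        x = [xi - step * gi for xi, gi in zip(x, gradient(x))]
    return x


def lamarckian_ga(dim=4, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Lamarckian step: the locally optimized coordinates are written back
        # into the genotype, so offspring inherit the improvement.
        pop = [local_optimize(ind) for ind in pop]
        pop.sort(key=energy)
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]   # uniform crossover
            child = [c + rng.gauss(0.0, 0.1) for c in child]   # Gaussian mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=energy)


best = lamarckian_ga()
print(energy(best))
```

A real docking setting would replace the toy energy with a scoring function over ligand translation, rotation, and torsion angles, which is where the number of degrees of freedom grows with ligand flexibility.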
An integer linear programming approach for finding deregulated subgraphs in regulatory networks
Deregulation of cell signaling pathways plays a crucial role in the development of tumors. The identification of such pathways requires effective analysis tools that facilitate the interpretation of expression differences. Here, we present a novel and highly efficient method for identifying deregulated subnetworks in a regulatory network. Given a score for each node that measures the degree of deregulation of the corresponding gene or protein, the algorithm computes the heaviest connected subnetwork of a specified size reachable from a designated root node. This root node can be interpreted as a molecular key player responsible for the observed deregulation. To demonstrate the potential of our approach, we analyzed three gene expression data sets. In one scenario, we compared expression profiles of non-malignant primary mammary epithelial cells derived from BRCA1 mutation carriers and of epithelial cells without BRCA1 mutation. Our results suggest that oxidative stress plays an important role in epithelial cells of BRCA1 mutation carriers and that the activation of stress proteins may result in avoidance of apoptosis leading to an increased overall survival of cells with genetic alterations. In summary, our approach opens new avenues for the elucidation of pathogenic mechanisms and for the detection of molecular key players.
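The optimization problem described above can be made concrete with a tiny exhaustive search. This sketch is deliberately not the integer linear programming formulation from the paper: it enumerates every node subset of the requested size, so it is only feasible for toy graphs. It assumes a directed graph given as an adjacency dictionary and takes the root as an input; the function names and the example network are hypothetical.

```python
from itertools import combinations


def reachable_from(root, nodes, edges):
    """Nodes reachable from root using only edges inside the candidate node set."""
    seen = {root}
    stack = [root]
    while stack:
        u = stack.pop()
        for v in edges.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def heaviest_rooted_subgraph(scores, edges, root, k):
    """Brute-force search for the highest-scoring k-node set reachable from root."""
    others = [n for n in scores if n != root]
    best_nodes, best_score = None, float("-inf")
    for combo in combinations(others, k - 1):
        nodes = {root, *combo}
        # Keep the candidate only if every node is reachable from the root.
        if reachable_from(root, nodes, edges) == nodes:
            total = sum(scores[n] for n in nodes)
            if total > best_score:
                best_nodes, best_score = nodes, total
    return best_nodes, best_score


# Toy network: node scores stand in for deregulation scores.
scores = {"A": 1.0, "B": 0.5, "C": 3.0, "D": 2.0, "E": 4.0}
edges = {"A": ["B", "C"], "B": ["D"], "D": ["E"]}
nodes, score = heaviest_rooted_subgraph(scores, edges, root="A", k=3)
print(sorted(nodes), score)
```

Node E has the highest score in this toy network but cannot be included at size 3, because reaching it from A requires both B and D. This is the kind of connectivity constraint that makes the problem harder than simply picking the top-scoring nodes.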