Publishes on Genomics and Phylogenetic Studies, Plant Molecular Biology Research, Plant nutrient uptake and metabolism. 345 papers and 92.9k citations.
MOTIVATION: Although many next-generation sequencing (NGS) read preprocessing tools already existed, we could not find any tool or combination of tools that met our requirements in terms of flexibility, correct handling of paired-end data and high performance. We have developed Trimmomatic as a more flexible and efficient preprocessing tool, which could correctly handle paired-end data. RESULTS: The value of NGS read preprocessing is demonstrated for both reference-based and reference-free tasks. Trimmomatic is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested. AVAILABILITY AND IMPLEMENTATION: Trimmomatic is licensed under GPL V3. It is cross-platform (Java 1.5+ required) and available at http://www.usadellab.org/cms/index.php?page=trimmomatic CONTACT: usadel@bio1.rwth-aachen.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
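What correct handling of paired-end data involves can be illustrated with a minimal Python sketch. It is not Trimmomatic's implementation: the read representation, the function names (`trim_trailing`, `filter_read`, `process_pairs`) and the thresholds are hypothetical, chosen only to show that both mates of a pair are filtered together, so the two paired outputs stay in the same order and a read whose mate was dropped is routed to a separate unpaired output.

```python
from typing import List, Optional, Tuple

# Hypothetical in-memory read: (name, sequence, per-base Phred qualities).
Read = Tuple[str, str, List[int]]


def trim_trailing(read: Read, min_quality: int) -> Read:
    """Cut bases off the 3' end while their quality is below min_quality."""
    name, seq, quals = read
    end = len(quals)
    while end > 0 and quals[end - 1] < min_quality:
        end -= 1
    return name, seq[:end], quals[:end]


def filter_read(read: Read, min_quality: int, min_length: int) -> Optional[Read]:
    """Trim a read; return None if the trimmed read is shorter than min_length."""
    trimmed = trim_trailing(read, min_quality)
    return trimmed if len(trimmed[1]) >= min_length else None


def process_pairs(forward: List[Read], reverse: List[Read],
                  min_quality: int = 20, min_length: int = 3):
    """Filter mates together so the paired outputs never fall out of step."""
    paired_fwd, paired_rev, unpaired = [], [], []
    for fwd, rev in zip(forward, reverse):
        kept_fwd = filter_read(fwd, min_quality, min_length)
        kept_rev = filter_read(rev, min_quality, min_length)
        if kept_fwd is not None and kept_rev is not None:
            # Both mates survived: keep them at the same index in both outputs.
            paired_fwd.append(kept_fwd)
            paired_rev.append(kept_rev)
        elif kept_fwd is not None:
            unpaired.append(kept_fwd)
        elif kept_rev is not None:
            unpaired.append(kept_rev)
    return paired_fwd, paired_rev, unpaired
```

Filtering the forward and reverse files independently would instead shift one file against the other as soon as a single read is dropped, which is the failure this design avoids.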
Metabolomics, in particular gas chromatography-mass spectrometry (GC-MS) based metabolite profiling of biological extracts, is rapidly becoming one of the cornerstones of functional genomics and systems biology. Metabolite profiling has profound applications in discovering the mode of action of drugs or herbicides, and in unravelling the effect of altered gene expression on metabolism and organism performance in biotechnological applications. As such the technology needs to be available to many laboratories. For this, an open exchange of information is required, like that already achieved for transcript and protein data. One of the key steps in metabolite profiling is the unambiguous identification of metabolites in highly complex metabolite preparations from biological samples. Collections of mass spectra, which comprise frequently observed metabolites of either known or unknown exact chemical structure, represent the most effective means to pool the identification efforts currently performed in many laboratories around the world. Here we present GMD, the Golm Metabolome Database, an open access metabolome database, which should enable these processes. GMD provides public access to custom mass spectral libraries, metabolite profiling experiments as well as additional information and tools, e.g. with regard to methods, spectral information or compounds. The main goal is to provide an exchange platform for experimental research and bioinformatics, so that metabolomics can be developed and improved through multidisciplinary cooperation. AVAILABILITY: http://csbdb.mpimp-golm.mpg.de/gmd.html CONTACT: Steinhauser@mpimp-golm.mpg.de SUPPLEMENTARY INFORMATION: http://csbdb.mpimp-golm.mpg.de/
Recent rapid advances in next-generation RNA sequencing (RNA-Seq) provide researchers with unprecedentedly large data sets and open new perspectives in transcriptomics. Furthermore, RNA-Seq-based transcript profiling can be applied to non-model and newly discovered organisms because it does not require a predefined measuring platform (such as microarrays). However, these novel technologies pose new challenges: the raw data need to be rigorously quality checked and filtered prior to analysis, and proper statistical methods have to be applied to extract biologically relevant information. Given the sheer volume of data, this is no trivial task and requires a combination of considerable technical resources along with bioinformatics expertise. To aid the individual researcher, we have developed RobiNA as an integrated solution that consolidates all steps of RNA-Seq-based differential gene-expression analysis in one user-friendly cross-platform application featuring a rich graphical user interface. RobiNA accepts raw FastQ files, SAM/BAM alignment files and count tables as input. It supports quality checking, flexible filtering and statistical analysis of differential gene expression based on state-of-the-art biostatistical methods developed in the R/Bioconductor projects. In-line help and a step-by-step manual guide users through the analysis. Installer packages for Mac OS X, Windows and Linux are available under the LGPL licence from http://mapman.gabipd.org/web/guest/robin.
The diurnal cycle strongly influences many plant metabolic and physiological processes. Arabidopsis thaliana rosettes were harvested six times during 12-h-light/12-h-dark treatments to investigate changes in gene expression using ATH1 arrays. Diagnostic gene sets were identified from published or in-house expression profiles of the response to light, sugar, nitrogen, and water deficit in seedlings and 4 h of darkness or illumination at ambient or compensation point [CO2]. Many sugar-responsive genes showed large diurnal expression changes, whose timing matched that of the diurnal changes of sugars. A set of circadian-regulated genes also showed large diurnal changes in expression. Comparison of published results from a free-running cycle with the diurnal changes in Columbia-0 (Col-0) and the starchless phosphoglucomutase (pgm) mutant indicated that sugars modify the expression of up to half of the clock-regulated genes. Principal component analysis identified genes that make large contributions to diurnal changes and confirmed that sugar and circadian regulation are the major inputs in Col-0 but that sugars dominate the response in pgm. Most of the changes in pgm are triggered by low sugar levels during the night rather than high levels in the light, highlighting the importance of responses to low sugar in diurnal gene regulation. We identified a set of candidate regulatory genes that show robust responses to alterations in sugar levels and change markedly during the diurnal cycle.
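How a component analysis can single out genes with large contributions is sketched below in plain Python. This is a generic illustration, not the analysis pipeline used in the study: the function names (`first_component_loadings`, `rank_genes`), the power-iteration approach and the toy input layout (rows are time points, columns are genes) are all assumptions made for the example. Genes are ranked by the absolute size of their loading on the first principal component.

```python
import math
from typing import List


def first_component_loadings(samples: List[List[float]],
                             iterations: int = 200) -> List[float]:
    """Approximate the first principal component by power iteration.

    samples: rows are time points, columns are genes (hypothetical layout).
    Returns one loading per gene.
    """
    n_samples, n_genes = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n_samples for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in samples]
    # Sample covariance matrix between genes.
    cov = [[sum(r[i] * r[j] for r in centered) / (n_samples - 1)
            for j in range(n_genes)] for i in range(n_genes)]
    # Assumes the start vector is not orthogonal to the leading component.
    vec = [1.0] * n_genes
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n_genes))
               for i in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # no variance at all: nothing to rank
            break
        vec = [x / norm for x in nxt]
    return vec


def rank_genes(samples: List[List[float]], gene_names: List[str]) -> List[str]:
    """Order genes by the absolute size of their first-component loading."""
    loadings = first_component_loadings(samples)
    order = sorted(range(len(gene_names)), key=lambda i: -abs(loadings[i]))
    return [gene_names[i] for i in order]
```

With a toy matrix in which one gene oscillates strongly across six time points, a second oscillates weakly and a third is constant, `rank_genes` places them in that order.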
MapMan is a user-driven tool that displays large genomics datasets onto diagrams of metabolic pathways or other processes. Here, we present new developments, including improvements of the gene assignments and the user interface, a strategy to visualize multilayered datasets, the incorporation of statistics packages, and extensions of the software to incorporate more biological information including visualization of corresponding genes and horizontal searches for similar global responses across large numbers of arrays.