Hans-Henrik Stærfeldt

RNAmmer: consistent and rapid annotation of ribosomal RNA genes

Karin Lagesen, Peter F. Hallin, Einar Andreas Rødland et al. | Nucleic Acids Research | 2007

Cited by 6.3k | Open Access

The publication of a complete genome sequence is usually accompanied by annotations of its genes. In contrast to protein coding genes, genes for ribosomal RNA (rRNA) are often poorly or inconsistently annotated. This makes comparative studies based on rRNA genes difficult. We have therefore created computational predictors for the major rRNA species from all kingdoms of life and compiled them into a program called RNAmmer. The program uses hidden Markov models trained on data from the 5S ribosomal RNA database and the European ribosomal RNA database project. A pre-screening step makes the method fast with little loss of sensitivity, enabling the analysis of a complete bacterial genome in less than a minute. Results from running RNAmmer on a large set of genomes indicate that the location of rRNAs can be predicted with a very high level of accuracy. Novel, unannotated rRNAs are also predicted in many genomes. The software as well as the genome analysis results are available at the CBS web server.

A comprehensive and quantitative comparison of text-mining in 15 million full-text articles versus their corresponding abstracts

David Westergaard, Hans-Henrik Stærfeldt, Christian Tønsberg et al. | PLoS Computational Biology | 2018

Cited by 172 | Open Access

Across academia and industry, text mining has become a popular strategy for keeping up with the rapid growth of the scientific literature. Text mining of the scientific literature has mostly been carried out on collections of abstracts, due to their availability. Here we present an analysis of 15 million English scientific full-text articles published during the period 1823-2016. We describe the development in article length and publication sub-topics during these nearly 250 years. We showcase the potential of text mining by extracting published protein-protein, disease-gene, and protein subcellular associations using a named entity recognition system, and quantitatively report on their accuracy using gold standard benchmark data sets. We subsequently compare the findings to corresponding results obtained on 16.5 million abstracts included in MEDLINE and show that text mining of full-text articles consistently outperforms using abstracts only.

Genome Update: proteome comparisons

Tim T. Binnewies, Peter F. Hallin, Hans-Henrik Stærfeldt et al. | Microbiology | 2005

Cited by 36


FeatureMap3D--a tool to map protein features and sequence conservation onto homologous structures in the PDB

Rasmus Wernersson, K. Rapacki, Hans-Henrik Stærfeldt et al. | Nucleic Acids Research | 2006

Cited by 18 | Open Access

FeatureMap3D is a web-based tool that maps protein features onto 3D structures. The user provides sequences annotated with any feature of interest, such as post-translational modifications, protease cleavage sites or exonic structure and FeatureMap3D will then search the Protein Data Bank (PDB) for structures of homologous proteins. The results are displayed both as an annotated sequence alignment, where the user-provided annotations as well as the sequence conservation between the query and the target sequence are displayed, and also as a publication-quality image of the 3D protein structure with the selected features and sequence conservation enhanced. The results are also returned in a readily parsable text format as well as a PyMol (http://pymol.sourceforge.net/) script file, which allows the user to easily modify the protein structure image to suit a specific purpose. FeatureMap3D can also be used without sequence annotation, to evaluate the quality of the alignment of the input sequences to the most homologous structures in the PDB, through the sequence conservation colored 3D structure visualization tool. FeatureMap3D is available at: http://www.cbs.dtu.dk/services/FeatureMap3D/.

Top publications by citations