M

Mario Stanke

Universitätsmedizin Greifswald

ORCID: 0000-0001-8696-0384

Publishes on Genomics and Phylogenetic Studies, RNA and protein synthesis mechanisms, Machine Learning in Bioinformatics. 151 papers and 31.7k citations.

151Publications
31.7kTotal Citations
#1in RNA-seq

Is this you? Claim your profile.

Add your photo, update your bio, and get notified when your ranking changes.

Top publicationsby citations

AUGUSTUS: ab initio prediction of alternative transcripts
Mario Stanke, O. Keller, Irfan Gunduz et al.|Nucleic Acids Research|2006
Cited by 3kOpen Access

AUGUSTUS is a software tool for gene prediction in eukaryotes based on a Generalized Hidden Markov Model, a probabilistic model of a sequence and its gene structure. Like most existing gene finders, the first version of AUGUSTUS returned one transcript per predicted gene and ignored the phenomenon of alternative splicing. Herein, we present a WWW server for an extended version of AUGUSTUS that is able to predict multiple splice variants. To our knowledge, this is the first ab initio gene finder that can predict multiple transcripts. In addition, we offer a motif searching facility, where user-defined regular expressions can be searched against putative proteins encoded by the predicted genes. The AUGUSTUS web interface and the downloadable open-source stand-alone program are freely available from http://augustus.gobics.de.

Using native and syntenically mapped cDNA alignments to improve <i>de novo</i> gene finding
Mario Stanke, Mark Diekhans, Robert Baertsch et al.|Bioinformatics|2008
Cited by 2.6kOpen Access

MOTIVATION: Computational annotation of protein coding genes in genomic DNA is a widely used and essential tool for analyzing newly sequenced genomes. However, current methods suffer from inaccuracy and do poorly with certain types of genes. Including additional sources of evidence of the existence and structure of genes can improve the quality of gene predictions. For many eukaryotic genomes, expressed sequence tags (ESTs) are available as evidence for genes. Related genomes that have been sequenced, annotated, and aligned to the target genome provide evidence of existence and structure of genes. RESULTS: We incorporate several different evidence sources into the gene finder AUGUSTUS. The sources of evidence are gene and transcript annotations from related species syntenically mapped to the target genome using TransMap, evolutionary conservation of DNA, mRNA and ESTs of the target species, and retroposed genes. The predictions include alternative splice variants where evidence supports it. Using only ESTs we were able to correctly predict at least one splice form exactly correct in 57% of human genes. Also using evidence from other species and human mRNAs, this number rises to 77%. Syntenic mapping is well-suited to annotate genomes closely related to genomes that are already annotated or for which extensive transcript evidence is available. Native cDNA evidence is most helpful when the alignments are used as compound information rather than independent positionwise information. AVAILABILITY: AUGUSTUS is open source and available at http://augustus.gobics.de. The gene predictions for human can be browsed and downloaded at the UCSC Genome Browser (http://genome.ucsc.edu).

BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database
Tomáš Brůna, Katharina J. Hoff, Alexandre Lomsadze et al.|NAR Genomics and Bioinformatics|2021
Cited by 1.8kOpen Access

The task of eukaryotic genome annotation remains challenging. Only a few genomes could serve as standards of annotation achieved through a tremendous investment of human curation efforts. Still, the correctness of all alternative isoforms, even in the best-annotated genomes, could be a good subject for further investigation. The new BRAKER2 pipeline generates and integrates external protein support into the iterative process of training and gene prediction by GeneMark-EP+ and AUGUSTUS. BRAKER2 continues the line started by BRAKER1 where self-training GeneMark-ET and AUGUSTUS made gene predictions supported by transcriptomic data. Among the challenges addressed by the new pipeline was a generation of reliable hints to protein-coding exon boundaries from likely homologous but evolutionarily distant proteins. In comparison with other pipelines for eukaryotic genome annotation, BRAKER2 is fully automatic. It is favorably compared under equal conditions with other pipelines, e.g. MAKER2, in terms of accuracy and performance. Development of BRAKER2 should facilitate solving the task of harmonization of annotation of protein-coding genes in genomes of different eukaryotic species. However, we fully understand that several more innovations are needed in transcriptomic and proteomic technologies as well as in algorithmic development to reach the goal of highly accurate annotation of eukaryotic genomes.

Gene prediction with a hidden Markov model and a new intron submodel
Mario Stanke, Stephan Waack|Bioinformatics|2003
Cited by 1.8k

MOTIVATION: The problem of finding the genes in eukaryotic DNA sequences by computational methods is still not satisfactorily solved. Gene finding programs have achieved relatively high accuracy on short genomic sequences but do not perform well on longer sequences with an unknown number of genes in them. Here existing programs tend to predict many false exons. RESULTS: We have developed a new program, AUGUSTUS, for the ab initio prediction of protein coding genes in eukaryotic genomes. The program is based on a Hidden Markov Model and integrates a number of known methods and submodels. It employs a new way of modeling intron lengths. We use a new donor splice site model, a new model for a short region directly upstream of the donor splice site model that takes the reading frame into account and apply a method that allows better GC-content dependent parameter estimation. AUGUSTUS predicts on longer sequences far more human and drosophila genes accurately than the ab initio gene prediction programs we compared it with, while at the same time being more specific. AVAILABILITY: A web interface for AUGUSTUS and the executable program are located at http://augustus.gobics.de.

AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints
Mario Stanke, Burkhard Morgenstern|Nucleic Acids Research|2005
Cited by 1.7kOpen Access

We present a WWW server for AUGUSTUS, a software for gene prediction in eukaryotic genomic sequences that is based on a generalized hidden Markov model, a probabilistic model of a sequence and its gene structure. The web server allows the user to impose constraints on the predicted gene structure. A constraint can specify the position of a splice site, a translation initiation site or a stop codon. Furthermore, it is possible to specify the position of known exons and intervals that are known to be exonic or intronic sequence. The number of constraints is arbitrary and constraints can be combined in order to pin down larger parts of the predicted gene structure. The result then is the most likely gene structure that complies with all given user constraints, if such a gene structure exists. The specification of constraints is useful when part of the gene structure is known, e.g. by expressed sequence tag or protein sequence alignments, or if the user wants to change the default prediction. The web interface and the downloadable stand-alone program are available free of charge at http://augustus.gobics.de/submission.

Similar Researchers

Coming soon — researchers in similar fields and career stages