The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

Ross Overbeek; Robert Olson; Gordon D. Pusch; G J Olsen; James J. Davis; Terry Disz; Robert A. Edwards; Svetlana Gerdes; Bruce Parrello; Maulik Shukla; Veronika Vonstein; Alice R. Wattam; Fangfang Xia; Rick Stevens

doi:10.1093/nar/gkt1226

Ross Overbeek (University of Illinois Urbana-Champaign), Robert Olson (University of Chicago), Gordon D. Pusch (Argonne National Laboratory), G J Olsen (Argonne National Laboratory), James J. Davis (University of Illinois Chicago), Terry Disz (University of Illinois Chicago), Robert A. Edwards (San Diego State University), Svetlana Gerdes (University of Illinois Urbana-Champaign), Bruce Parrello (Argonne National Laboratory), Maulik Shukla (University of Illinois Urbana-Champaign), Veronika Vonstein (University of Illinois Urbana-Champaign), Alice R. Wattam (University of Illinois Urbana-Champaign), Fangfang Xia (Argonne National Laboratory), Rick Stevens (Argonne National Laboratory)

Nucleic Acids Research

November 29, 2013

Cited by 4,619 · Open Access

Abstract

In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.
