A phylogeny-driven genomic encyclopaedia of Bacteria and Archaea

Dongying Wu (Joint Genome Institute), Philip Hugenholtz (Joint Genome Institute), Konstantinos Mavromatis (Joint Genome Institute), Rüdiger Pukall (Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures), Eileen Dalin (Joint Genome Institute), Natalia Ivanova (Joint Genome Institute), Victor Kunin (Joint Genome Institute), Lynne Goodwin (Joint Genome Institute), Martin Wu (University of Virginia), Brian J. Tindall (Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures), Sean Hooper (Joint Genome Institute), Amrita Pati (Joint Genome Institute), Athanasios Lykidis (Joint Genome Institute), Stefan Spring (Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures), Iain Anderson (Joint Genome Institute), Patrik D’haeseleer (Joint Genome Institute), Adam Zemła (Lawrence Livermore National Laboratory), Mitchell Singer (University of California, Davis), Alla Lapidus (Joint Genome Institute), Matt Nolan (Joint Genome Institute), Alex Copeland (Joint Genome Institute), Cliff Han (Joint Genome Institute), Feng Chen (Joint Genome Institute), Jan‐Fang Cheng (Joint Genome Institute), Susan Lucas (Joint Genome Institute), Cheryl A. Kerfeld (Joint Genome Institute), Elke Lang (Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures), Sabine Gronow (Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures), Patrick Chain (Joint Genome Institute), David Bruce (Los Alamos National Laboratory), Edward M. Rubin (Joint Genome Institute), Nikos C. Kyrpides (Joint Genome Institute), Hans‐Peter Klenk (Leibniz Institute DSMZ – German Collection of Microorganisms and Cell Cultures), Jonathan A. Eisen (University of California, Davis)
Nature
December 1, 2009
Cited by 980. Open Access.

Abstract

The bacterial and archaeal genomes sequenced to date were chosen mainly on the basis of their physiology, which has resulted in a distinct phylogenetic bias. The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project takes an alternative approach: it chooses genomes based on the organism's phylogenetic position, with the aim of filling in the gaps in sequencing along the bacterial and archaeal branches of the tree of life. The value of this approach has been demonstrated by a pilot study of the genome sequences of 56 culturable species selected to maximize phylogenetic coverage. Analysis of the sequences provides insights into phylogenetics, protein function and genome annotation.

Sequencing of bacterial and archaeal genomes has revolutionized our understanding of the many roles played by microorganisms [1]. There are now nearly 1,000 completed bacterial and archaeal genomes available [2], most of which were chosen for sequencing on the basis of their physiology. As a result, the perspective provided by the currently available genomes is limited by a highly biased phylogenetic distribution [3,4,5]. To explore the value added by choosing microbial genomes for sequencing on the basis of their evolutionary relationships, we have sequenced and analysed the genomes of 56 culturable species of Bacteria and Archaea selected to maximize phylogenetic coverage.
Analysis of these genomes demonstrated pronounced benefits (compared to an equivalent set of genomes randomly selected from the existing database) in diverse areas including the reconstruction of phylogenetic history, the discovery of new protein families and biological properties, and the prediction of functions for known genes from other organisms. Our results strongly support the need for systematic ‘phylogenomic’ efforts to compile a phylogeny-driven ‘Genomic Encyclopedia of Bacteria and Archaea’ in order to derive maximum knowledge from existing microbial genome data as well as from genome sequences to come.

