Greengenes2 unifies microbial data in a single reference tree
Daniel McDonald (University of California San Diego), Yueyu Jiang (University of California San Diego), Metin Balaban (University of California San Diego), Kalen Cantrell (University of California San Diego), Qiyun Zhu (Arizona State University), Antonio González (University of California San Diego), James T. Morton (National Institutes of Health), Giorgia Nicolaou (University of California San Diego), Donovan H. Parks (The University of Queensland), Søren Michael Karst (Columbia University), Mads Albertsen (Aalborg University), Philip Hugenholtz (The University of Queensland), Todd Z. DeSantis, Se Jin Song (University of California San Diego), Andrew Bartko (University of California San Diego), Aki S. Havulinna (Finnish Institute for Health and Welfare), Pekka Jousilahti (Finnish Institute for Health and Welfare), Susan Cheng (Cedars-Sinai Medical Center), Michael Inouye (Baker Heart and Diabetes Institute), Teemu Niiranen (University of Turku), Mohit Jain, Veikko Salomaa (Finnish Institute for Health and Welfare), Leo Lahti (University of Turku), Siavash Mirarab (University of California San Diego), Rob Knight (University of California San Diego)
Cited by 571. Open Access.
Abstract
Studies using 16S rRNA and shotgun metagenomics typically yield different results, usually attributed to PCR amplification biases. We introduce Greengenes2, a reference tree that unifies genomic and 16S rRNA databases in a consistent, integrated resource. By inserting sequences into a whole-genome phylogeny, we show that 16S rRNA and shotgun metagenomic data generated from the same samples agree in principal coordinates space, taxonomy and phenotype effect size when analyzed with the same tree.
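One simple way to quantify whether two data types "agree" for the same samples is to correlate their pairwise sample-distance matrices (the Mantel statistic, shown here without its permutation test). The sketch below is an illustration under stated assumptions and is not necessarily the comparison used in this work: the function names are hypothetical, the distance values are made-up toy numbers, and in practice each matrix would come from a beta-diversity metric computed against the shared reference tree.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(dm):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(dm)
    return [dm[i][j] for i, j in combinations(range(n), 2)]


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mantel_r(dm_a, dm_b):
    """Mantel statistic: correlation between matching pairwise distances.

    Both matrices must list the same samples in the same order.
    """
    if len(dm_a) != len(dm_b):
        raise ValueError("distance matrices must cover the same samples")
    return pearson(upper_triangle(dm_a), upper_triangle(dm_b))


# Toy, made-up distances for four samples (hypothetical values only).
dm_16s = [
    [0.0, 0.2, 0.7, 0.8],
    [0.2, 0.0, 0.6, 0.9],
    [0.7, 0.6, 0.0, 0.3],
    [0.8, 0.9, 0.3, 0.0],
]
dm_shotgun = [
    [0.0, 0.3, 0.6, 0.9],
    [0.3, 0.0, 0.7, 0.8],
    [0.6, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]

r = mantel_r(dm_16s, dm_shotgun)
print(f"Mantel r between 16S and shotgun distances: {r:.3f}")
```

A value of r near 1 indicates that samples that are far apart in one data type are also far apart in the other. A significance estimate would additionally require permuting the sample order of one matrix, which is omitted here for brevity.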