Producing polished prokaryotic pangenomes with the Panaroo pipeline

Gerry Tonkin‐Hill (University of Oslo), Neil MacAlasdair (University of Cambridge), Christopher Ruis (MRC Laboratory of Molecular Biology), Aaron Weimann (European Bioinformatics Institute), Gal Horesh (Wellcome Sanger Institute), John A. Lees (Imperial College London), Rebecca A. Gladstone (University of Oslo), Stephanie W. Lo (Wellcome Sanger Institute), Christopher A. Beaudoin (University of Cambridge), R. Andrés Floto (University of Cambridge), Simon D. W. Frost (Microsoft (United States)), Jukka Corander (University of Helsinki), Stephen D. Bentley (Wellcome Sanger Institute), Julian Parkhill (University of Cambridge)
Genome Biology
July 22, 2020
Cited by 1,283. Open Access.

Abstract

Population-level comparisons of prokaryotic genomes must take into account the substantial differences in gene content resulting from horizontal gene transfer, gene duplication and gene loss. However, the automated annotation of prokaryotic genomes is imperfect, and errors due to fragmented assemblies, contamination, diverse gene families and mis-assemblies accumulate over the population, leading to profound consequences when analysing the set of all genes found in a species. Here, we introduce Panaroo, a graph-based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies. Panaroo is available at https://github.com/gtonkinhill/panaroo.

