Pandora: nucleotide-resolution bacterial pan-genomics with reference graphs

Rachel Colquhoun (European Bioinformatics Institute), Michael B. Hall (European Bioinformatics Institute), Leandro Lima (European Bioinformatics Institute), Leah W. Roberts (European Bioinformatics Institute), Kerri M. Malone (European Bioinformatics Institute), Martin Hunt (European Bioinformatics Institute), Brice Letcher (European Bioinformatics Institute), Jane Hawkey (Monash University), Sophie George (University of Oxford), Louise Pankhurst (University of Oxford), Zamin Iqbal (European Bioinformatics Institute)
Genome Biology
September 14, 2021
Cited by 65, Open Access

Abstract

We present pandora, a novel pan-genome graph structure and algorithms for identifying variants across the full bacterial pan-genome. As much bacterial adaptability hinges on the accessory genome, methods which analyze SNPs in just the core genome have unsatisfactory limitations. Pandora approximates a sequenced genome as a recombinant of references, detects novel variation and pan-genotypes multiple samples. Using a reference graph of 578 Escherichia coli genomes, we compare 20 diverse isolates. Pandora recovers more rare SNPs than single-reference-based tools, is significantly better than picking the closest RefSeq reference, and provides a stable framework for analyzing diverse samples without reference bias.

