GALBA: Genome Annotation with Miniprot and AUGUSTUS

Tomáš Brůna (Joint Genome Institute), Heng Li (Harvard University), Joseph Guhlin (University of Otago), Daniel Honsel (Czech Academy of Sciences, Institute of Computer Science), Steffen Herbold (University of Passau), Mario Stanke (Universitätsmedizin Greifswald), Natalia Nenasheva (Universitätsmedizin Greifswald), Matthis Ebel (Universitätsmedizin Greifswald), Lars Gabriel (Universitätsmedizin Greifswald), Katharina J. Hoff (Universitätsmedizin Greifswald)
bioRxiv (Cold Spring Harbor Laboratory)
April 10, 2023
Cited by 5 · Open Access

Abstract

The Earth BioGenome Project has rapidly increased the number of available eukaryotic genomes, but most released genomes still lack annotation of protein-coding genes. In addition, no transcriptome data are available for some genomes. Various gene annotation tools have been developed, but each has its limitations. Here, we introduce GALBA, a fully automated pipeline that uses miniprot, a rapid protein-to-genome aligner, in combination with AUGUSTUS to predict genes with high accuracy. Accuracy results indicate that GALBA is particularly strong in the annotation of large vertebrate genomes. We also present use cases in insects, vertebrates, and a previously unannotated land plant. GALBA is fully open source and available as a Docker image for easy execution with Singularity in high-performance computing environments. Our pipeline addresses the critical need for accurate gene annotation in newly sequenced genomes, and we believe that GALBA will greatly facilitate genome annotation for diverse organisms.
