Seeker: alignment-free identification of bacteriophage genomes by deep learning

Noam Auslander (National Institutes of Health), Ayal B. Gussow (National Institutes of Health), Sean Benler (National Institutes of Health), Yuri I. Wolf (National Institutes of Health), Eugene V. Koonin (National Institutes of Health)
Nucleic Acids Research
September 22, 2020
Open Access

Abstract

Recent advances in metagenomic sequencing have enabled the discovery of diverse, distinct microbes and viruses. Bacteriophages, the most abundant biological entities on Earth, evolve rapidly, so detecting unknown bacteriophages in sequence datasets is a challenge. Most existing detection methods rely on sequence similarity to known bacteriophage sequences, which impedes the identification and characterization of distinct, highly divergent bacteriophage families. Here we present Seeker, a deep-learning tool for alignment-free identification of phage sequences. Seeker allows rapid detection of phages in sequence datasets and differentiation of phage sequences from bacterial ones, even when those phages exhibit little sequence similarity to established phage families. We comprehensively validate Seeker's ability to identify previously unidentified phages, and employ this method to detect unknown phages, some of which are highly divergent from the known phage families. We provide a web portal (seeker.pythonanywhere.com) and a user-friendly Python package (github.com/gussow/seeker) that allow researchers to easily apply Seeker in metagenomic studies to detect diverse unknown bacteriophages.

