A Reference Viral Database (RVDB) To Enhance Bioinformatics Analysis of High-Throughput Sequencing for Novel Virus Detection

Norman Goodacre (Center for Biologics Evaluation and Research), Aisha AlJanahi (Center for Biologics Evaluation and Research), Subhiksha Nandakumar (Center for Biologics Evaluation and Research), Mike Mikailov (United States Food and Drug Administration), Arifa S. Khan (Center for Biologics Evaluation and Research)
mSphere
March 13, 2018
Cited by 262 | Open Access

Abstract

To facilitate bioinformatics analysis of high-throughput sequencing (HTS) data for the detection of both known and novel viruses, we have developed a new reference viral database (RVDB) that provides a broad representation of different virus species from eukaryotes by including all viral, virus-like, and virus-related sequences (excluding bacteriophages), regardless of their size. In particular, RVDB contains endogenous nonretroviral elements, endogenous retroviruses, and retrotransposons. Sequences were clustered to reduce redundancy while retaining high viral sequence diversity. A particularly useful feature of RVDB is the reduction of cellular sequences, which can enhance the run efficiency of large transcriptomic and genomic data analysis and increase the specificity of virus detection.
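The redundancy-reduction step mentioned in the abstract can be illustrated with a toy example. The abstract does not say which clustering tool, similarity measure, or threshold was used for RVDB, so everything below is an assumption made for illustration: the function names, the k-mer Jaccard similarity, the 0.9 threshold, and the three example sequences are all hypothetical. The sketch only shows the general idea of greedy clustering, where near-identical sequences collapse onto one representative while dissimilar sequences are kept.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int = 4) -> Set[str]:
    """Return the set of overlapping k-mers in a sequence (uppercased)."""
    s = seq.upper()
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two k-mer sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def greedy_cluster(
    records: Dict[str, str], threshold: float = 0.9, k: int = 4
) -> Dict[str, List[str]]:
    """Greedy clustering, longest sequence first.

    Each sequence joins the first existing representative whose k-mer
    Jaccard similarity is >= threshold; otherwise it becomes a new
    representative. The threshold and similarity measure are
    illustrative assumptions, not values from the source.
    """
    # Sort by decreasing length, then by name for a deterministic order.
    order = sorted(records, key=lambda name: (-len(records[name]), name))
    rep_kmers: Dict[str, Set[str]] = {}
    clusters: Dict[str, List[str]] = {}
    for name in order:
        ks = kmers(records[name], k)
        for rep, rks in rep_kmers.items():
            if jaccard(ks, rks) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative was similar enough: start a new cluster.
            rep_kmers[name] = ks
            clusters[name] = [name]
    return clusters


# Hypothetical toy input: two identical sequences and one unrelated sequence.
records = {
    "a": "ACGTACGTTAGCCGATAGGCTA",
    "a_dup": "ACGTACGTTAGCCGATAGGCTA",
    "b": "TTTTGGGGCCCCAAAATTGGCCAA",
}
clusters = greedy_cluster(records)
representatives = sorted(clusters)
```

In this toy input the duplicate collapses onto its representative and the unrelated sequence stays in its own cluster, so only the representatives would be kept in a reduced database. A production pipeline would more plausibly rely on a dedicated sequence-clustering tool with an alignment-based identity threshold rather than k-mer sets.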

