Virus Detection by High-Throughput Sequencing of Small RNAs: Large-Scale Performance Testing of Sequence Analysis Strategies

Sébastien Massart (Gembloux Agro-Bio Tech), Michela Chiumenti (Gembloux Agro-Bio Tech), Kris De Jonghe (Gembloux Agro-Bio Tech), R. Glover (Gembloux Agro-Bio Tech), Annelies Haegeman (Gembloux Agro-Bio Tech), Igor Koloniuk (Gembloux Agro-Bio Tech), Petr Komínek (Czech Agrifood Research Center), Jan Kreuze (Gembloux Agro-Bio Tech), Denis Kutnjak (Gembloux Agro-Bio Tech), Leonidas Lotos (Gembloux Agro-Bio Tech), François Maclot (Gembloux Agro-Bio Tech), Varvara I. Maliogka (Gembloux Agro-Bio Tech), Hans J. Maree (Gembloux Agro-Bio Tech), Thibaut Olivier (Gembloux Agro-Bio Tech), Antonio Olmos (Gembloux Agro-Bio Tech), Mikhail M. Pooggin (Gembloux Agro-Bio Tech), Jean-Sébastien Reynard (Gembloux Agro-Bio Tech), Ana Belén Ruiz-García (Gembloux Agro-Bio Tech), Dana Šafářová (Gembloux Agro-Bio Tech), Pierre H. H. Schneeberger (Gembloux Agro-Bio Tech), Noa Sela (Gembloux Agro-Bio Tech), Silvia Turco (Gembloux Agro-Bio Tech), Eeva J. Vainio (Gembloux Agro-Bio Tech), Éva Várallyay (Gembloux Agro-Bio Tech), Eric Verdin (Gembloux Agro-Bio Tech), Marcel Westenberg (Gembloux Agro-Bio Tech), Yves Brostaux (Gembloux Agro-Bio Tech), Thierry Candresse (Gembloux Agro-Bio Tech)
Phytopathology
August 2, 2018
Cited by 145
Open Access

Abstract

Recent developments in high-throughput sequencing (HTS), also called next-generation sequencing (NGS), technologies and bioinformatics have drastically changed research on viral pathogens and spurred growing interest in the field of virus diagnostics. However, the reliability of HTS-based virus detection protocols must be evaluated before adopting them for diagnostics. Many different bioinformatics algorithms aimed at detecting viruses in HTS data have been reported but little attention has been paid thus far to their sensitivity and reliability for diagnostic purposes. Therefore, we compared the ability of 21 plant virology laboratories, each employing a different bioinformatics pipeline, to detect 12 plant viruses through a double-blind large-scale performance test using 10 datasets of 21- to 24-nucleotide small RNA (sRNA) sequences from three different infected plants. The sensitivity of virus detection ranged between 35 and 100% among participants, with a marked negative effect when sequence depth decreased. The false-positive detection rate was very low and mainly related to the identification of host genome-integrated viral sequences or misinterpretation of the results. Reproducibility was high (91.6%). This work revealed the key influence of bioinformatics strategies for the sensitive detection of viruses in HTS sRNA datasets and, more specifically (i) the difficulty in detecting viral agents when they are novel or their sRNA abundance is low, (ii) the influence of key parameters at both assembly and annotation steps, (iii) the importance of completeness of reference sequence databases, and (iv) the significant level of scientific expertise needed when interpreting pipeline results. Overall, this work underlines key parameters and proposes recommendations for reliable sRNA-based detection of known and unknown viruses.

