Coronavirus GenBrowser for monitoring the transmission and evolution of SARS-CoV-2

Dalang Yu (Shanghai Institute of Nutrition and Health), Xiao Yang (Shanghai Institute of Nutrition and Health), Bixia Tang (Beijing Institute of Genomics), Yi-Hsuan Pan (East China Normal University), Jianing Yang (Shanghai Institute of Nutrition and Health), Guangya Duan (Beijing Institute of Genomics), Junwei Zhu (Beijing Institute of Genomics), Zi-Qian Hao (Shanghai Institute of Nutrition and Health), Hailong Mu (Shanghai Institute of Nutrition and Health), Long Dai (Shanghai Institute of Nutrition and Health), Wangjie Hu (Shanghai Institute of Nutrition and Health), Mochen Zhang (Beijing Institute of Genomics), Ying Cui (Beijing Institute of Genomics), Tong Jin (Beijing Institute of Genomics), Cuiping Li (Beijing Institute of Genomics), Lina Ma (Beijing Institute of Genomics), Language translation team (Chinese Academy of Sciences), Xiao Su (Chinese Academy of Sciences), Guoqing Zhang (Beijing Institute of Genomics), Wenming Zhao (Beijing Institute of Genomics), Haipeng Li (Shanghai Institute of Nutrition and Health)
medRxiv
December 24, 2020
Cited by 3; Open Access

Abstract

Genomic epidemiology is important for studying the COVID-19 pandemic, and more than two million SARS-CoV-2 genomic sequences have been deposited in public databases. However, the exponential increase in sequences poses unprecedented bioinformatic challenges. Here, we present the Coronavirus GenBrowser (CGB), which is based on a highly efficient analysis framework and a movie-maker strategy. In total, 1,002,739 high-quality genomic sequences with transmission-related metadata were analyzed and visualized. The core data file is only 12.20 MB, small enough for efficient sharing of clean data. Quick visualization modules and rich interactive operations are provided for exploring the annotated SARS-CoV-2 evolutionary tree. A CGB binary nomenclature is proposed to name each internal lineage. The pre-analyzed data can be filtered according to user-defined criteria to explore the transmission of SARS-CoV-2. Different evolutionary analyses can also be performed easily, such as the detection of accelerated evolution and ongoing positive selection. Moreover, 75 genomic spots that are conserved in SARS-CoV-2 but not conserved in other coronaviruses were identified; these may indicate functional elements that are specifically important for SARS-CoV-2. The CGB not only enables users with no programming skills to analyze millions of genomic sequences, but also offers a panoramic view of the transmission and evolution of SARS-CoV-2.

