GSA: Genome Sequence Archive

Yanqing Wang (Beijing Institute of Genomics), Fuhai Song (Chinese Academy of Sciences), Junwei Zhu (Beijing Institute of Genomics), Sisi Zhang (Beijing Institute of Genomics), Yadong Yang (Chinese Academy of Sciences), Tingting Chen (Beijing Institute of Genomics), Bixia Tang (Beijing Institute of Genomics), Lili Dong (Beijing Institute of Genomics), Nan Ding (Chinese Academy of Sciences), Qian Zhang (Chinese Academy of Sciences), Zhouxian Bai (Chinese Academy of Sciences), Xunong Dong (Chinese Academy of Sciences), Huanxin Chen (Beijing Institute of Genomics), Mingyuan Sun (Beijing Institute of Genomics), Shuang Zhai (Beijing Institute of Genomics), Yubin Sun (Beijing Institute of Genomics), Lei Yu (Beijing Institute of Genomics), Lan Li (Beijing Institute of Genomics), Jingfa Xiao (Chinese Academy of Sciences), Xiangdong Fang (Chinese Academy of Sciences), Hongxing Lei (Chinese Academy of Sciences), Zhang Zhang (Chinese Academy of Sciences), Wenming Zhao (Fudan University)
Genomics Proteomics & Bioinformatics
February 1, 2017
Cited by 833
Open Access

Abstract

As sequencing technologies advance towards higher throughput and lower cost, sequence data are being generated at an unprecedented rate. To provide an efficient and easy-to-use platform for managing this volume of sequence data, we present the Genome Sequence Archive (GSA; http://bigd.big.ac.cn/gsa or http://gsa.big.ac.cn), a data repository for archiving raw sequence data. In compliance with the data standards and structures of the International Nucleotide Sequence Database Collaboration (INSDC), GSA organizes data into four data objects (BioProject, BioSample, Experiment, and Run), accepts raw sequence reads produced by a variety of sequencing platforms, stores both the sequence reads and the metadata submitted from all over the world, and makes all these data publicly available to scientific communities worldwide. In the era of big data, GSA not only complements existing INSDC members by easing the growing burden of handling the sequence data deluge, but also takes on significant responsibility for global big data archiving and provides free, unrestricted access to all publicly available data in support of research activities throughout the world.
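
The four data objects named in the abstract can be pictured as a small hierarchy. The sketch below is illustrative only and is not GSA's actual schema: it assumes the nesting commonly used in INSDC-style archives (a BioProject groups BioSamples and Experiments, each Experiment refers to one BioSample, and each Run belongs to one Experiment). All class fields, the helper methods, the accession strings, and the file names are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Run:
    """One sequencing run: the raw read files themselves."""
    accession: str
    files: List[str]


@dataclass
class Experiment:
    """Library and platform metadata; points at the sample it was made from."""
    accession: str
    platform: str
    sample_accession: str
    runs: List[Run] = field(default_factory=list)


@dataclass
class BioSample:
    """Description of the biological source material."""
    accession: str
    organism: str


@dataclass
class BioProject:
    """Top-level container for one research effort."""
    accession: str
    title: str
    samples: Dict[str, BioSample] = field(default_factory=dict)
    experiments: List[Experiment] = field(default_factory=list)

    def add_sample(self, sample: BioSample) -> None:
        self.samples[sample.accession] = sample

    def add_experiment(self, experiment: Experiment) -> None:
        # Assumed rule: an experiment must reference a registered sample.
        if experiment.sample_accession not in self.samples:
            raise KeyError(f"unknown sample: {experiment.sample_accession}")
        self.experiments.append(experiment)

    def run_files(self) -> List[str]:
        """Flatten every raw read file under this project."""
        return [f for e in self.experiments for r in e.runs for f in r.files]


# Placeholder accessions and file names, for illustration only.
project = BioProject("PROJECT-0001", "Example resequencing study")
project.add_sample(BioSample("SAMPLE-0001", "Oryza sativa"))
experiment = Experiment("EXPERIMENT-0001", "Illumina HiSeq", "SAMPLE-0001")
experiment.runs.append(Run("RUN-0001", ["reads_1.fastq.gz", "reads_2.fastq.gz"]))
project.add_experiment(experiment)
```

Keeping the sample reference on the Experiment, rather than nesting Experiments inside samples, lets several experiments share one sample; whether GSA models it exactly this way is an assumption here.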

