Genome Modeling System: A Knowledge Management Platform for Genomics

Malachi Griffith(Washington University in St. Louis), Obi L. Griffith(Washington University in St. Louis), Scott M. Smith(Washington University in St. Louis), Avinash Ramu(Washington University in St. Louis), Matthew B. Callaway(Washington University in St. Louis), Anthony M. Brummett(Washington University in St. Louis), Michael J. Kiwala(Washington University in St. Louis), Adam Coffman(Washington University in St. Louis), Allison Regier(Washington University in St. Louis), Ben J. Oberkfell(Washington University in St. Louis), Gabriel E. Sanderson(Washington University in St. Louis), Thomas P. Mooney(Washington University in St. Louis), Nathaniel G. Nutter(Washington University in St. Louis), Edward A. Belter(Washington University in St. Louis), Feiyu Du(Washington University in St. Louis), Robert L. Long(Washington University in St. Louis), Travis E. Abbott(Washington University in St. Louis), Ian Ferguson(Washington University in St. Louis), David Morton(Washington University in St. Louis), Mark M. Burnett(Washington University in St. Louis), James V. Weible(Washington University in St. Louis), Joshua B. Peck(Washington University in St. Louis), Adam F. Dukes(Washington University in St. Louis), Joshua F. McMichael(Washington University in St. Louis), Justin T. Lolofie(Washington University in St. Louis), Brian R. Derickson(Washington University in St. Louis), Jasreet Hundal(Washington University in St. Louis), Zachary L. Skidmore(Washington University in St. Louis), Benjamin J. Ainscough(Washington University in St. Louis), Nathan D. Dees(Washington University in St. Louis), William Schierding(Washington University in St. Louis), Cyriac Kandoth(Washington University in St. Louis), Kyung H. Kim(Washington University in St. Louis), Charles Lu(Washington University in St. Louis), Christopher Harris(Washington University in St. Louis), Nicole Maher(Washington University in St. Louis), Christopher A. Maher(Washington University in St. Louis), Vincent Magrini(Washington University in St. Louis), Benjamin Abbott(Washington University in St. Louis), Ken Chen(Washington University in St. Louis), Eric M. Clark(Washington University in St. Louis), Indraniel Das(Washington University in St. Louis), Xian Fan(Washington University in St. Louis), Amy Hawkins(Washington University in St. Louis), Todd G. Hepler(Washington University in St. Louis), Todd Wylie(Washington University in St. Louis), Shawn Leonard(Washington University in St. Louis), W.E. Schroeder(Washington University in St. Louis), Xiaoqi Shi(Washington University in St. Louis), Lynn K. Carmichael(Washington University in St. Louis), Matthew R. Weil(Washington University in St. Louis), Richard W. Wohlstadter(Washington University in St. Louis), Gary Stiehr(Washington University in St. Louis), Michael D. McLellan(Washington University in St. Louis), Craig Pohl(Washington University in St. Louis), Christopher A. Miller(Washington University in St. Louis), Daniel C. Koboldt(Washington University in St. Louis), Jason Walker(Washington University in St. Louis), James M. Eldred(Washington University in St. Louis), David E. Larson(Washington University in St. Louis), David J. Dooling(Washington University in St. Louis), Li Ding(Washington University in St. Louis), Elaine R. Mardis(Washington University in St. Louis), Richard K. Wilson(Washington University in St. Louis)
PLoS Computational Biology
July 9, 2015
Cited by 90 | Open Access

Abstract

In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.

