Whole-genome sequencing and intensive analysis of the undomesticated soybean (Glycine soja Sieb. and Zucc.) genome

Moon Young Kim(Seoul National University), Sunghoon Lee(Seoul National University), Kyujung Van(Seoul National University), Tae‐Hyung Kim(Korea Research Institute of Bioscience and Biotechnology), Soon‐Chun Jeong(Korea Research Institute of Bioscience and Biotechnology), Ik‐Young Choi(Seoul National University), Dae-Soo Kim(Seoul National University), Yong‐Seok Lee(Seoul National University), Daeui Park(Korea Research Institute of Bioscience and Biotechnology), Jianxin Ma(Purdue University West Lafayette), Woo-Yeon Kim(Korea Research Institute of Bioscience and Biotechnology), Byoung-Chul Kim(Korea Research Institute of Bioscience and Biotechnology), Sung‐Jin Park(Korea Research Institute of Bioscience and Biotechnology), Kyung‐A Lee(Korea Research Institute of Bioscience and Biotechnology), Dong Hyun Kim(Seoul National University), Kil Hyun Kim(Seoul National University), Jin Hee Shin(Seoul National University), Young Eun Jang(Seoul National University), Kyung Do Kim(Seoul National University), Wei Xian Liu(Seoul National University), Tanapon Chaisan(Seoul National University), Yang Jae Kang(Seoul National University), Yeong-Ho Lee(Seoul National University), Kook‐Hyung Kim(Seoul National University), Jung‐Kyung Moon(Rural Development Administration), Jeremy Schmutz(HudsonAlpha Institute for Biotechnology), Scott A. Jackson(Purdue University West Lafayette), Jong Bhak(Korea Research Institute of Bioscience and Biotechnology), Suk‐Ha Lee(Seoul National University)
Proceedings of the National Academy of Sciences
December 3, 2010
Cited by 335. Open Access.

Abstract

The genome of soybean (Glycine max), a commercially important crop, has recently been sequenced, making soybean one of six crop species with a sequenced genome. Here we report the genome sequence of G. soja, the undomesticated ancestor of G. max (specifically, G. soja var. IT182932). A total of 48.8 Gb of Illumina Genome Analyzer (Illumina-GA) short DNA reads was aligned to the G. max reference genome, and a consensus was determined for G. soja. This consensus sequence spanned 915.4 Mb, representing a coverage of 97.65% of the published G. max genome sequence at an average mapping depth of 43-fold. The nucleotide sequence of the G. soja genome, which contains 2.5 Mb of substituted bases and 406 kb of small insertions/deletions relative to G. max, is ∼0.31% different from that of G. max. In addition to the mapped 915.4-Mb consensus sequence, 32.4 Mb of large deletions and 8.3 Mb of novel sequence contigs were detected in the G. soja genome. Validation of G. soja versus G. max nucleotide variants by Roche Genome Sequencer FLX sequencing showed 99.99% concordance for single-nucleotide polymorphism calls and 98.82% agreement for insertion/deletion calls made from Illumina-GA reads. Data presented in this study suggest that the G. soja/G. max complex may be at least 0.27 million y old, appearing well before the relatively recent event of domestication (6,000 to 9,000 y ago). This finding suggests that soybean domestication is complex and that more in-depth study of population genetics is needed. In any case, genome comparison of domesticated and undomesticated forms of soybean can facilitate soybean improvement.
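
The ∼0.31% divergence quoted in the abstract can be roughly reproduced from the totals given there. The following is a minimal sketch, not the authors' method: the function names are hypothetical, and it assumes the divergence is simply substituted bases plus small insertion/deletion bases divided by the mapped consensus length. Because the inputs are the rounded figures from the abstract, the result comes out near 0.32% rather than exactly 0.31%, a gap that may simply reflect rounding.

```python
# Back-of-the-envelope check of figures quoted in the abstract.
# Assumption: divergence = (substituted bp + small indel bp) / consensus bp.
# Function names are illustrative only, not taken from the paper.

def percent_divergence(substituted_bp: float, indel_bp: float,
                       consensus_bp: float) -> float:
    """Return the share of differing bases as a percentage of the consensus."""
    return (substituted_bp + indel_bp) / consensus_bp * 100.0


def implied_reference_mb(consensus_mb: float, coverage_percent: float) -> float:
    """Return the reference length (Mb) implied by a consensus length and coverage."""
    return consensus_mb / (coverage_percent / 100.0)


if __name__ == "__main__":
    # Rounded inputs from the abstract: 2.5 Mb, 406 kb, 915.4 Mb, 97.65%.
    divergence = percent_divergence(2.5e6, 406e3, 915.4e6)
    reference_mb = implied_reference_mb(915.4, 97.65)
    print(f"Approximate divergence: {divergence:.2f}%")
    print(f"Implied G. max reference length: {reference_mb:.1f} Mb")
```

The second helper works backward from the 97.65% coverage figure to the reference length that coverage implies. It is a derived estimate under the same rounding caveat, not a number stated in the source.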

