Multiple reference genomes and transcriptomes for Arabidopsis thaliana

Xiangchao Gan (University of Oxford), Oliver Stegle (Max Planck Institute for Developmental Biology), Jonas Behr (Max Planck Society), Joshua G. Steffen (University of Utah), Philipp Drewe (Friedrich Miescher Laboratory), Katie L. Hildebrand (Kansas State University), Rune Lyngsoe (University of Oxford), Sebastian J. Schultheiß (Max Planck Society), Edward J. Osborne (University of Utah), Vipin T. Sreedharan (Friedrich Miescher Laboratory), André Kahles (Friedrich Miescher Laboratory), Regina Bohnert (Friedrich Miescher Laboratory), Géraldine Jean (Max Planck Society), Paul Derwent (Wellcome Trust), Paul Kersey (European Bioinformatics Institute), Eric J. Belfield (University of Oxford), Nicholas P. Harberd (University of Oxford), Eric Kemen (Sainsbury Laboratory), Christopher Toomajian (Kansas State University), Paula X. Kover (University of Bath), Richard M. Clark (University of Utah), Gunnar Rätsch (Friedrich Miescher Laboratory), Richard Mott (University of Oxford)
Nature
August 26, 2011
Cited by 683
Open Access

Abstract

Genetic differences between Arabidopsis thaliana accessions underlie the plant’s extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions.

The genomes and transcriptomes of 18 natural Arabidopsis thaliana strains have been compared with that of Col-0, the most widely used A. thaliana wild type that was sequenced as part of the Arabidopsis Genome Initiative. The comparison has been used to create a comprehensive overview of genetic variability in this classic 'laboratory' plant. Each individual genome was compared with every other individual genome in a 'many-to-many' approach, which maximizes the capture of gene variations.
