Robert J. Elshire

Cornell University

ORCID: 0000-0003-1753-6920

Publishes on Genetic Mapping and Diversity in Plants and Animals, Genetic diversity and population structure, Genomics and Phylogenetic Studies. 20 papers and 13.4k citations.

20 Publications
13.4k Total Citations

Top publications by citations

A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species
Cited by 6.7k | Open Access

Advances in next generation technologies have driven the costs of DNA sequencing down to the point that genotyping-by-sequencing (GBS) is now feasible for high diversity, large genome species. Here, we report a procedure for constructing GBS libraries based on reducing genome complexity with restriction enzymes (REs). This approach is simple, quick, extremely specific, highly reproducible, and may reach important regions of the genome that are inaccessible to sequence capture approaches. By using methylation-sensitive REs, repetitive regions of genomes can be avoided and lower copy regions targeted with two to three fold higher efficiency. This tremendously simplifies computationally challenging alignment problems in species with high levels of genetic diversity. The GBS procedure is demonstrated with maize (IBM) and barley (Oregon Wolfe Barley) recombinant inbred populations where roughly 200,000 and 25,000 sequence tags were mapped, respectively. An advantage in species like barley that lack a complete genome sequence is that a reference map need only be developed around the restriction sites, and this can be done in the process of sample genotyping. In such cases, the consensus of the read clusters across the sequence tagged sites becomes the reference. Alternatively, for kinship analyses in the absence of a reference genome, the sequence tags can simply be treated as dominant markers. Future application of GBS to breeding, conservation, and global species and population surveys may allow plant breeders to conduct genomic selection on a novel germplasm or species without first having to develop any prior molecular tools, or conservation biologists to determine population structure without prior knowledge of the genome or diversity in the species.

TASSEL-GBS: A High Capacity Genotyping by Sequencing Analysis Pipeline
Cited by 1.7k | Open Access

Genotyping by sequencing (GBS) is a next generation sequencing based method that takes advantage of reduced representation to enable high throughput genotyping of large numbers of individuals at a large number of SNP markers. The relatively straightforward, robust, and cost-effective GBS protocol is currently being applied in numerous species by a large number of researchers. Herein we describe a bioinformatics pipeline, TASSEL-GBS, designed for the efficient processing of raw GBS sequence data into SNP genotypes. The TASSEL-GBS pipeline successfully fulfills the following key design criteria: (1) Ability to run on the modest computing resources that are typically available to small breeding or ecological research programs, including desktop or laptop machines with only 8-16 GB of RAM, (2) Scalability from small to extremely large studies, where hundreds of thousands or even millions of SNPs can be scored in up to 100,000 individuals (e.g., for large breeding programs or genetic surveys), and (3) Applicability in an accelerated breeding context, requiring rapid turnover from tissue collection to genotypes. Although a reference genome is required, the pipeline can also be run with an unfinished "pseudo-reference" consisting of numerous contigs. We describe the TASSEL-GBS pipeline in detail and benchmark it based upon a large scale, species wide analysis in maize (Zea mays), where the average error rate was reduced to 0.0042 through application of population genetic-based SNP filters. Overall, the GBS assay and the TASSEL-GBS pipeline provide robust tools for studying genomic diversity.

A First-Generation Haplotype Map of Maize
Cited by 785

A-Maize-ing: Maize is one of our oldest and most important crops, having been domesticated approximately 9000 years ago in central Mexico. Schnable et al. (p. 1112; see the cover) present the results of sequencing the B73 inbred maize line. The findings elucidate how maize became diploid after an ancestral doubling of its chromosomes and reveal transposable element movement and activity and recombination. Vielle-Calzada et al. (p. 1078) have sequenced the Palomero Toluqueño (Palomero) landrace, a highland popcorn from Mexico, which, when compared to the B73 line, reveals multiple loci impacted by domestication. Swanson-Wagner et al. (p. 1118) exploit possession of the genome to analyze expression differences occurring between lines. The identification of single nucleotide polymorphisms and copy number variations among lines was used by Gore et al. (p. 1115) to generate a haplotype map of maize. While chromosomal diversity in maize is high, it is likely that recombination is the major force affecting the levels of heterozygosity in maize. The availability of the maize genome will help to guide future agricultural and biofuel applications (see the Perspective by Feuillet and Eversole).

Switchgrass Genomic Diversity, Ploidy, and Evolution: Novel Insights from a Network-Based SNP Discovery Protocol
Fei Lü, Alexander E. Lipka, Jeff Glaubitz et al. | PLoS Genetics | 2013
Cited by 678 | Open Access

Switchgrass (Panicum virgatum L.) is a perennial grass that has been designated as an herbaceous model biofuel crop for the United States of America. To facilitate accelerated breeding programs of switchgrass, we developed both an association panel and linkage populations for genome-wide association study (GWAS) and genomic selection (GS). All of the 840 individuals were then genotyped using genotyping by sequencing (GBS), generating 350 GB of sequence in total. As a highly heterozygous polyploid (tetraploid and octoploid) species lacking a reference genome, switchgrass is highly intractable with earlier methodologies of single nucleotide polymorphism (SNP) discovery. To access the genetic diversity of species like switchgrass, we developed a SNP discovery pipeline based on a network approach called the Universal Network-Enabled Analysis Kit (UNEAK). Complexities that hinder single nucleotide polymorphism discovery, such as repeats, paralogs, and sequencing errors, are easily resolved with UNEAK. Here, 1.2 million putative SNPs were discovered in a diverse collection of primarily upland, northern-adapted switchgrass populations. Further analysis of this data set revealed the fundamentally diploid nature of tetraploid switchgrass. Taking advantage of the high conservation of genome structure between switchgrass and foxtail millet (Setaria italica (L.) P. Beauv.), two parent-specific, synteny-based, ultra high-density linkage maps containing a total of 88,217 SNPs were constructed. Also, our results showed clear patterns of isolation-by-distance and isolation-by-ploidy in natural populations of switchgrass. Phylogenetic analysis supported a general south-to-north migration path of switchgrass. In addition, this analysis suggested that upland tetraploid arose from upland octoploid. All together, this study provides unparalleled insights into the diversity, genomic complexity, population structure, phylogeny, phylogeography, ploidy, and evolutionary dynamics of switchgrass.