The Genomic Landscapes of Human Breast and Colorectal Cancers
Human cancer is caused by the accumulation of mutations in oncogenes and tumor suppressor genes. To catalog the genetic changes that occur during tumorigenesis, we isolated DNA from 11 breast and 11 colorectal tumors and determined the sequences of the genes in the Reference Sequence database in these samples. Based on analysis of exons representing 20,857 transcripts from 18,191 genes, we conclude that the genomic landscapes of breast and colorectal cancers are composed of a handful of commonly mutated gene "mountains" and a much larger number of gene "hills" that are mutated at low frequency. We describe statistical and bioinformatic tools that may help identify mutations with a role in tumorigenesis. These results have implications for understanding the nature and heterogeneity of human cancers and for using personal genomics for tumor diagnosis and therapy.
Human Genome Sequencing Using Unchained Base Reads on Self-Assembling DNA Nanoarrays
Genome sequencing of large numbers of individuals promises to advance the understanding, treatment, and prevention of human diseases, among other applications. We describe a genome sequencing platform that achieves efficient imaging and low reagent consumption with combinatorial probe anchor ligation chemistry to independently assay each base from patterned nanoarrays of self-assembling DNA nanoballs. We sequenced three human genomes with this platform, generating an average of 45- to 87-fold coverage per genome and identifying 3.2 to 4.5 million sequence variants per genome. Validation of one genome data set demonstrates a sequence accuracy of about 1 false variant per 100 kilobases. The high accuracy, affordable cost of $4400 for sequencing consumables, and scalability of this platform enable complete human genome sequencing for the detection of rare variants in large-scale genetic studies.
A sequence-based variation map of 8.27 million SNPs in inbred mouse strains
Simple repetitive DNA sequences from primates: Compilation and analysis
Jerzy Jurka, Charit L. Pethiyagoda|Journal of Molecular Evolution|1995
A HapMap leads to a Capsicum annuum SNP infinium array: a new tool for pepper breeding
The Capsicum genus (pepper) is part of the Solanaceae family. It has been important in many cultures worldwide for its key nutritional components and its uses as spices, medicines, ornamentals and vegetables. Worldwide population growth is associated with demand for more nutritionally valuable vegetables while contending with decreasing resources and available land. These conditions require increased efficiency in pepper breeding to deal with these imminent challenges. Through resequencing of inbred lines we have completed a valuable haplotype map (HapMap) for the pepper genome based on single-nucleotide polymorphisms (SNPs). The identified SNPs were annotated and classified based on their gene annotation in the pepper draft genome sequence and the phenotype of the sequenced inbred lines. A selection of one marker per gene model was utilized to create the PepperSNP16K array, which simultaneously genotyped 16 405 SNPs, of which 90.7% were found to be informative. A set of 84 inbred and hybrid lines and a mapping population of 90 interspecific F2 individuals were utilized to validate the array. Diversity analysis of the inbred lines shows a distinct separation of bell versus chile/hot pepper types and separates them into five distinct germplasm groups. The interspecific population created between Tabasco (C. frutescens chile type) and P4 (C. annuum blocky type) produced a linkage map with 5546 markers separated into 1361 bins on 12 linkage groups representing 1392.3 cM. This publicly available genotyping platform can be used to rapidly assess a large number of markers in a reproducible high-throughput manner for pepper.
As a standardized tool for genetic analyses, the PepperSNP16K can be used worldwide to share findings and analyze QTLs for important traits leading to continued improvement of pepper for consumers. Data and information on the array are available through the Solanaceae Genomics Network.