Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes
We have conducted a comprehensive search for conserved elements in vertebrate genomes, using genome-wide multiple alignments of five vertebrate species (human, mouse, rat, chicken, and Fugu rubripes). Parallel searches have been performed with multiple alignments of four insect species (three species of Drosophila and Anopheles gambiae), two species of Caenorhabditis, and seven species of Saccharomyces. Conserved elements were identified with a computer program called phastCons, which is based on a two-state phylogenetic hidden Markov model (phylo-HMM). PhastCons works by fitting a phylo-HMM to the data by maximum likelihood, subject to constraints designed to calibrate the model across species groups, and then predicting conserved elements based on this model. The predicted elements cover roughly 3%-8% of the human genome (depending on the details of the calibration procedure) and substantially higher fractions of the more compact Drosophila melanogaster (37%-53%), Caenorhabditis elegans (18%-37%), and Saccharomyces cerevisiae (47%-68%) genomes. From yeasts to vertebrates, in order of increasing genome size and general biological complexity, increasing fractions of conserved bases are found to lie outside of the exons of known protein-coding genes. In all groups, the most highly conserved elements (HCEs), by log-odds score, are hundreds or thousands of bases long. These elements share certain properties with ultraconserved elements, but they tend to be longer and less perfectly conserved, and they overlap genes of somewhat different functional categories. In vertebrates, HCEs are associated with the 3' UTRs of regulatory genes, stable gene deserts, and megabase-sized regions rich in moderately conserved noncoding sequences. Noncoding HCEs also show strong statistical evidence of an enrichment for RNA secondary structure.
A high-resolution map of human evolutionary constraint using 29 mammals
The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease. This comparative genomics study, comparing the complete human genome sequence with those of 29 placental mammals, including chimpanzees, mice and dogs, identifies 4.2% of the human genome as constrained by evolutionary selection, and ascribes a potential function to about 60% of these constrained bases. A series of evolutionary signatures emerges, providing insights into coding and non-coding functional genomic elements, candidate RNA structural families and aspects of genome organization and evolution. Overlap with disease-associated variants indicates that the findings will be relevant for studies of human disease.
An RNA gene expressed during cortical development evolved rapidly in humans
Ancient human genome sequence of an extinct Palaeo-Eskimo
We report here the genome sequence of an ancient human. Obtained from ∼4,000-year-old permafrost-preserved hair, the genome represents a male individual from the first known culture to settle in Greenland. Sequenced to an average depth of 20×, we recover 79% of the diploid genome, an amount close to the practical limit of current sequencing technologies. We identify 353,151 high-confidence single-nucleotide polymorphisms (SNPs), of which 6.8% have not been reported previously. We estimate raw read contamination to be no higher than 0.8%. We use functional SNP assessment to assign possible phenotypic characteristics of the individual that belonged to a culture whose location has yielded only trace human remains. We compare the high-confidence SNPs to those of contemporary populations to find the populations most closely related to the individual. This provides evidence for a migration from Siberia into the New World some 5,500 years ago, independent of that giving rise to the modern Native Americans and Inuit. For the first time, the sequence of a near-complete nuclear genome has been obtained from the tissue of an ancient human. It comes from permafrost-preserved hair, about 4,000 years old, of a male palaeo-Eskimo of the Saqqaq culture, the earliest known settlers in Greenland. Functional single-nucleotide polymorphism (SNP) assessment was used to assign possible phenotypic characteristics. The analysis provides evidence for a migration from Siberia into the New World some 5,500 years ago, independent of the migration that gave rise to the modern Native Americans and Inuit. Elsewhere in the issue we profile the paper's last author Eske Willerslev, who headed the project and found the lock of hair in a Copenhagen museum basement, after a fruitless search among the archaeological sites of Peary Land.
The first genome sequence of an ancient human is reported. It comes from an approximately 4,000-year-old permafrost-preserved hair from a male from the first known culture to settle in Greenland. Functional single-nucleotide polymorphism (SNP) assessment is used to assign possible phenotypic characteristics and high-confidence SNPs are compared to those of contemporary populations to find those most closely related to the individual.
Comprehensive Transcriptional Analysis of Early-Stage Urothelial Carcinoma