
Stephen M. J. Searle

Wellcome Sanger Institute

Publishes on Genomics and Phylogenetic Studies, Chromosomal and Genetic Variations, and RNA and protein synthesis mechanisms. 54 papers and 36k citations.



Top publications by citations

GENCODE: The reference human genome annotation for The ENCODE Project
Jennifer Harrow, Adam Frankish, José M. González et al.|Genome Research|2012
Cited by 5k|Open Access

The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.

The genomic basis of adaptive evolution in threespine sticklebacks
Cited by 1.9k|Open Access

Marine stickleback fish have colonized and adapted to thousands of streams and lakes formed since the last ice age, providing an exceptional opportunity to characterize genomic mechanisms underlying repeated ecological adaptation in nature. Here we develop a high-quality reference genome assembly for threespine sticklebacks. By sequencing the genomes of twenty additional individuals from a global set of marine and freshwater populations, we identify a genome-wide set of loci that are consistently associated with marine–freshwater divergence. Our results indicate that reuse of globally shared standing genetic variation, including chromosomal inversions, has an important role in repeated evolution of distinct marine and freshwater sticklebacks, and in the maintenance of divergent ecotypes during early stages of reproductive isolation. Both coding and regulatory changes occur in the set of loci underlying marine–freshwater evolution, but regulatory changes appear to predominate in this well known example of repeated adaptive evolution in nature.

A reference genome sequence for threespine sticklebacks, and re-sequencing of 20 additional world-wide populations, reveals loci used repeatedly during vertebrate evolution; multiple chromosome inversions contribute to marine-freshwater divergence, and regulatory variants predominate over coding variants in this classic example of adaptive evolution in natural environments.

Threespine sticklebacks have become a powerful model for studying the molecular basis of adaptive evolution. This paper presents a high-quality reference genome sequence, along with genomes of 20 further individuals from a global set of marine and freshwater populations. Genomic analysis reveals that reuse of globally shared standing genetic variation plays an important part in repeated evolution of distinct stickleback populations, and in the maintenance of divergent ecotypes during early stages of reproductive isolation. The data are consistent with an important role for regulatory changes during parallel evolution of marine and freshwater sticklebacks.

The Jalview Java alignment editor
Michèle Clamp, James Cuff, Stephen M. J. Searle et al.|Bioinformatics|2004
Cited by 1.7k|Open Access

Summary: Multiple sequence alignment remains a crucial method for understanding the function of groups of related nucleic acid and protein sequences. However, it is known that automatic multiple sequence alignments can often be improved by manual editing. Therefore, tools are needed to view and edit multiple sequence alignments. Due to growth in the sequence databases, multiple sequence alignments can often be large and difficult to view efficiently. The Jalview Java alignment editor is presented here, which enables fast viewing and editing of large multiple sequence alignments. Availability: The Jar file and source code for Jalview is freely available via the World Wide Web at http://www.jalview.org. A Jalview mailing list is also available by e-mailing majordomo@sanger.ac.uk with subscribe Jalview in the body of the mail.

The Genome Sequence of Taurine Cattle: A Window to Ruminant Biology and Evolution
Cited by 1.3k|Open Access

To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.