Simon Whelan

Uppsala University

Publishes on Genomics and Phylogenetic Studies, Genetic diversity and population structure, Evolution and Paleontology Studies. 56 papers and 19.6k citations.

Top publications by citations

A General Empirical Model of Protein Evolution Derived from Multiple Protein Families Using a Maximum-Likelihood Approach
Simon Whelan, Nick Goldman | Molecular Biology and Evolution | 2001
Cited by 3k | Open Access

Phylogenetic inference from amino acid sequence data uses mainly empirical models of amino acid replacement and is therefore dependent on those models. Two of the more widely used models, the Dayhoff and JTT models, are estimated using similar methods that can utilize large numbers of sequences from many unrelated protein families but are somewhat unsatisfactory because they rely on assumptions that may lead to systematic error and discard a large amount of the information within the sequences. The alternative method of maximum-likelihood estimation may utilize the information in the sequence data more efficiently and suffers from no systematic error, but it has previously been applicable to relatively few sequences related by a single phylogenetic tree. Here, we combine the best attributes of these two methods using an approximate maximum-likelihood method. We implemented this approach to estimate a new model of amino acid replacement from a database of globular protein sequences comprising 3,905 amino acid sequences split into 182 protein families. While the new model has an overall structure similar to those of other commonly used models, there are significant differences. The new model outperforms the Dayhoff and JTT models with respect to maximum-likelihood values for a large majority of the protein families in our database. This suggests that it provides a better overall fit to the evolutionary process in globular proteins and may lead to more accurate phylogenetic tree estimates. Potentially, this matrix, and the methods used to generate it, may also be useful in other areas of research, such as biological sequence database searching, sequence alignment, and protein structure prediction, for which an accurate description of amino acid replacement is required.

Protein Phylogenetic Analysis of Ca2+/cation Antiporters and Insights into their Evolution in Plants
Laura R. Emery, Simon Whelan, Kendal D. Hirschi et al. | Frontiers in Plant Science | 2012
Cited by 944 | Open Access

Cation transport is a critical process in all organisms and is essential for mineral nutrition, ion stress tolerance, and signal transduction. Transporters that are members of the Ca(2+)/cation antiporter (CaCA) superfamily are involved in the transport of Ca(2+) and/or other cations using the counter exchange of another ion such as H(+) or Na(+). The CaCA superfamily has been previously divided into five transporter families: the YRBG, Na(+)/Ca(2+) exchanger (NCX), Na(+)/Ca(2+), K(+) exchanger (NCKX), H(+)/cation exchanger (CAX), and cation/Ca(2+) exchanger (CCX) families, which include the well-characterized NCX and CAX transporters. To examine the evolution of CaCA transporters within higher plants and the green plant lineage, CaCA genes were identified from the genomes of sequenced flowering plants, a bryophyte, lycophyte, and freshwater and marine algae, and compared with those from non-plant species. We found evidence of the expansion and increased diversity of flowering plant genes within the CAX and CCX families. Genes related to the NCX family are present in land plants, though they encode distinct MHX homologs which probably have an altered transport function. In contrast, the NCX and NCKX genes which are absent in land plants have been retained in many species of algae, especially the marine algae, indicating that these organisms may share "animal-like" characteristics of Ca(2+) homeostasis and signaling. A group of genes encoding novel CAX-like proteins containing an EF-hand domain were identified from plants and selected algae but appeared to be lacking in any other species. Lack of functional data for most of the CaCA proteins makes it impossible to reliably predict substrate specificity and function for many of the groups or individual proteins. The abundance and diversity of CaCA genes throughout all branches of life indicate the importance of this class of cation transporter, and that many transporters with novel functions are waiting to be discovered.

Covariation in Frequencies of Substitution, Deletion, Transposition, and Recombination During Eutherian Evolution
Ross C. Hardison, Krishna M. Roskin, Shan Yang et al. | Genome Research | 2003
Cited by 290 | Open Access

Six measures of evolutionary change in the human genome were studied, three derived from the aligned human and mouse genomes in conjunction with the Mouse Genome Sequencing Consortium, consisting of (1) nucleotide substitution per fourfold degenerate site in coding regions, (2) nucleotide substitution per site in relics of transposable elements active only before the human-mouse speciation, and (3) the nonaligning fraction of human DNA that is nonrepetitive or in ancestral repeats; and three derived from human genome data alone, consisting of (4) SNP density, (5) frequency of insertion of transposable elements, and (6) rate of recombination. Features 1 and 2 are measures of nucleotide substitutions at two classes of "neutral" sites, whereas 4 is a measure of recent mutations. Feature 3 is a measure dominated by deletions in mouse, whereas 5 represents insertions in human. It was found that all six vary significantly in megabase-sized regions genome-wide, and many vary together. This indicates that some regions of a genome change slowly by all processes that alter DNA, and others change faster. Regional variation in all processes is correlated with, but not completely accounted for by, GC content in human and the difference between GC content in human and mouse.