Robert M. Waterhouse

SIB Swiss Institute of Bioinformatics

ORCID: 0000-0003-4199-9052

Publishes on Genomics and Phylogenetic Studies, Insect symbiosis and bacterial influences, Insect and Arachnid Ecology and Behavior. 271 papers and 34.2k citations.

Top publications by citations

BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs
Cited by 14.4k | Open Access

MOTIVATION: Genomics has revolutionized biological research, but quality assessment of the resulting assembled sequences is complicated and remains mostly limited to technical measures like N50. RESULTS: We propose a measure for quantitative assessment of genome assembly and annotation completeness based on evolutionarily informed expectations of gene content. We implemented the assessment procedure in open-source software, with sets of Benchmarking Universal Single-Copy Orthologs, named BUSCO. AVAILABILITY AND IMPLEMENTATION: Software implemented in Python and datasets available for download from http://busco.ezlab.org. CONTACT: evgeny.zdobnov@unige.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

BUSCO Applications from Quality Assessments to Gene Prediction and Phylogenomics
Robert M. Waterhouse, Mathieu Seppey, Felipe A. Simão et al.|Molecular Biology and Evolution|2017
Cited by 2.5k | Open Access

Genomics promises comprehensive surveying of genomes and metagenomes, but rapidly changing technologies and expanding data volumes make evaluation of completeness a challenging task. Technical sequencing quality metrics can be complemented by quantifying completeness of genomic data sets in terms of the expected gene content of Benchmarking Universal Single-Copy Orthologs (BUSCO, http://busco.ezlab.org). The latest software release implements a complete refactoring of the code to make it more flexible and extendable to facilitate high-throughput assessments. The original six lineage assessment data sets have been updated with improved species sampling, 34 new subsets have been built for vertebrates, arthropods, fungi, and prokaryotes that greatly enhance resolution, and data sets are now also available for nematodes, protists, and plants. Here, we present BUSCO v3 with example analyses that highlight the wide-ranging utility of BUSCO assessments, which extend beyond quality control of genomics data sets to applications in comparative genomics analyses, gene predictor training, metagenomics, and phylogenomics.

Extensive introgression in a malaria vector species complex revealed by phylogenomics
Cited by 730

INTRODUCTION: The notion that species boundaries can be porous to introgression is increasingly accepted. Yet the broader role of introgression in evolution remains contentious and poorly documented, partly because of the challenges involved in accurately identifying introgression in the very groups where it is most likely to occur. Recently diverged species often have incomplete reproductive barriers and may hybridize where they overlap. However, because of retention and stochastic sorting of ancestral polymorphisms, inference of the correct species branching order is notoriously challenging for recent speciation events, especially those closely spaced in time. Without knowledge of species relationships, it is impossible to identify instances of introgression.
RATIONALE: Since the discovery that the single mosquito taxon described in 1902 as Anopheles gambiae was actually a complex of several closely related and morphologically indistinguishable sibling species, the correct species branching order has remained controversial and unresolved. This Afrotropical complex contains the world’s most important vectors of human malaria, owing to their close association with humans, as well as minor vectors and species that do not bite humans. On the basis of ecology and behavior, one might predict phylogenetic clustering of the three highly anthropophilic vector species. However, previous phylogenetic analyses of the complex based on a limited number of markers strongly disagree about relationships between the major vectors, potentially because of historical introgression between them. To investigate the history of the species complex, we used whole-genome reference assemblies, as well as dozens of resequenced individuals from the field.
RESULTS: We observed a large amount of phylogenetic discordance between trees generated from the autosomes and X chromosome. The autosomes, which make up the majority of the genome, overwhelmingly supported the grouping of the three major vectors of malaria, An. gambiae, An. coluzzii, and An. arabiensis. In stark contrast, the X chromosome strongly supported the grouping of An. arabiensis with a species that plays no role in malaria transmission, An. quadriannulatus. Although the whole-genome consensus phylogeny unequivocally agrees with the autosomal topology, we found that the topology most often located on the X chromosome follows the historical species branching order, with pervasive introgression on the autosomes producing relationships that group the three highly anthropophilic species together. With knowledge of the correct species branching order, we are further able to uncover introgression between another species pair, as well as a complex history of balancing selection, introgression, and local adaptation of a large autosomal inversion that confers aridity tolerance.
CONCLUSION: We identify the correct species branching order of the An. gambiae species complex, resolving a contentious phylogeny. Notably, lineages leading to the principal vectors of human malaria were among the first in the complex to radiate and are not most closely related to each other. Pervasive autosomal introgression between these human malaria vectors, including nonsister vector species, suggests that traits enhancing vectorial capacity can be acquired not only through de novo mutation but also through a more rapid process of interspecific genetic exchange.
Figure caption: Time-lapse photographs of an adult anopheline mosquito emerging from its pupal case.
Related item in Science: D. E. Neafsey et al., Science 347, 1258522 (2015)

Evolutionary Dynamics of Immune-Related Genes and Pathways in Disease-Vector Mosquitoes
Cited by 704 | Open Access

Mosquitoes are vectors of parasitic and viral diseases of immense importance for public health. The acquisition of the genome sequence of the yellow fever and Dengue vector, Aedes aegypti (Aa), has enabled a comparative phylogenomic analysis of the insect immune repertoire: in Aa, the malaria vector Anopheles gambiae (Ag), and the fruit fly Drosophila melanogaster (Dm). Analysis of immune signaling pathways and response modules reveals both conservative and rapidly evolving features associated with different functional gene categories and particular aspects of immune reactions. These dynamics reflect in part continuous readjustment between accommodation and rejection of pathogens and suggest how innate immunity may have evolved.

Highly evolvable malaria vectors: The genomes of 16 Anopheles mosquitoes
Cited by 615 | Open Access

INTRODUCTION: Control of mosquito vectors has historically proven to be an effective means of eliminating malaria. Human malaria is transmitted only by mosquitoes in the genus Anopheles, but not all species within the genus, or even all members of each vector species, are efficient malaria vectors. Variation in vectorial capacity for human malaria among Anopheles mosquito species is determined by many factors, including behavior, immunity, and life history.
RATIONALE: This variation in vectorial capacity suggests an underlying genetic/genomic plasticity that results in variation of key traits determining vectorial capacity within the genus. Sequencing the genome of Anopheles gambiae, the most important malaria vector in sub-Saharan Africa, has offered numerous insights into how that species became highly specialized to live among and feed upon humans and how susceptibility to mosquito control strategies is determined. Until very recently, similar genomic resources have not existed for other anophelines, limiting comparisons to individual genes or sets of genomic markers with no genome-wide data to investigate attributes associated with vectorial capacity across the genus.
RESULTS: We sequenced and assembled the genomes and transcriptomes of 16 anophelines from Africa, Asia, Europe, and Latin America, spanning ~100 million years of evolution and chosen to represent a range of evolutionary distances from An. gambiae, a variety of geographic locations and ecological conditions, and varying degrees of vectorial capacity. Genome assembly quality reflected DNA template quality and homozygosity. Despite variation in contiguity, the assemblies were remarkably complete and searches for arthropod-wide single-copy orthologs generally revealed few missing genes. Genome annotation supported with RNA sequencing transcriptomes yielded between 10,738 and 16,149 protein-coding genes for each species. Relative to Drosophila, the closest dipteran genus for which equivalent genomic resources exist, Anopheles exhibits a dynamic genomic evolutionary profile. Comparative analyses show a fivefold faster rate of gene gain and loss, elevated gene shuffling on the X chromosome, and more intron losses in Anopheles. Some determinants of vectorial capacity, such as chemosensory genes, do not show elevated turnover but instead diversify through protein-sequence changes. We also document evidence of variation in important reproductive phenotypes, genes controlling immunity to Plasmodium malaria parasites and other microbes, genes encoding cuticular and salivary proteins, and genes conferring metabolic insecticide resistance. This dynamism of anopheline genes and genomes may contribute to their flexible capacity to take advantage of new ecological niches, including adapting to humans as primary hosts.
CONCLUSIONS: Anopheline mosquitoes exhibit a molecular evolutionary profile very distinct from Drosophila, and their genomes harbor strong evidence of functional variation in traits that determine vectorial capacity. These 16 new reference genome assemblies provide a foundation for hypothesis generation and testing to further our understanding of the diverse biological traits that determine vectorial capacity.
Figure caption: Geography, vector status, and molecular phylogeny of the 16 newly sequenced anopheline mosquitoes and selected other dipterans. The maximum likelihood molecular phylogeny of all sequenced anophelines and two mosquito outgroups was constructed from the aligned protein sequences of 1085 single-copy orthologs. Shapes between branch termini and species names indicate vector status and are colored according to geographic ranges depicted on the map.