Publishes on Genetics, Bioinformatics, and Biomedical Research, Protein Structure and Dynamics, Bioinformatics and Genomic Networks. 64 papers and 16.3k citations.
Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known protein sequences and experimentally determined structures. Fully automated workflows and servers simplify and streamline the homology modelling process, also allowing users without specific computational expertise to generate reliable protein models and have easy access to modelling results, their visualization and interpretation. Here, we present an update to the SWISS-MODEL server, which pioneered the field of automated modelling 25 years ago and has been continuously developed since. Recently, its functionality has been extended to the modelling of homo- and heteromeric complexes. Starting from the amino acid sequences of the interacting proteins, both the stoichiometry and the overall structure of the complex are inferred by homology modelling. Other major improvements include the implementation of a new modelling engine, ProMod3, and the introduction of a new local model quality estimation method, QMEANDisCo. SWISS-MODEL is freely available at https://swissmodel.expasy.org.
SWISS-MODEL Repository (SMR) is a database of annotated 3D protein structure models generated by the automated SWISS-MODEL homology modeling pipeline. It currently holds >400 000 high quality models covering almost 20% of Swiss-Prot/UniProtKB entries. In this manuscript, we provide an update of features and functionalities which have been implemented recently. We address improvements in target coverage, model quality estimates, functional annotations and in-page visualization. We also introduce a new update concept which includes regular updates of an expanded set of core organism models and UniProtKB-based targets, complemented by user-driven on-demand update of individual models. With the new release of the modeling pipeline, SMR has implemented a REST-API and adopted an open licencing model for accessing model coordinates, thus enabling bulk download for groups of targets and fostering re-use of models in other contexts. SMR can be accessed at https://swissmodel.expasy.org/repository.
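As an illustration of how programmatic access to the repository might look, the following minimal Python sketch builds per-target request URLs for a list of UniProtKB accessions. The path pattern `/uniprot/<accession>.<format>` and the helper names are assumptions made for this example, not details stated in the abstract; check them against the SMR API documentation before use. The sketch only constructs URLs and performs no network access.

```python
from urllib.parse import quote

# Base address taken from the abstract above.
SMR_BASE = "https://swissmodel.expasy.org/repository"


def smr_entry_url(uniprot_ac: str, fmt: str = "json") -> str:
    """Build a repository URL for one UniProtKB accession.

    ASSUMPTION: the '/uniprot/<AC>.<fmt>' path layout is illustrative
    and should be verified against the official API documentation.
    """
    if fmt not in {"json", "pdb"}:
        raise ValueError(f"unsupported format: {fmt!r}")
    accession = uniprot_ac.strip().upper()
    if not accession:
        raise ValueError("empty accession")
    return f"{SMR_BASE}/uniprot/{quote(accession)}.{fmt}"


def smr_bulk_urls(accessions, fmt: str = "json"):
    """Return one URL per accession, skipping duplicates but keeping order."""
    seen = set()
    urls = []
    for ac in accessions:
        key = ac.strip().upper()
        if key and key not in seen:
            seen.add(key)
            urls.append(smr_entry_url(key, fmt))
    return urls
```

A caller could pass the resulting URLs to any HTTP client to retrieve model metadata or coordinates for a group of targets.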
PDBsum, http://www.ebi.ac.uk/pdbsum, is a website providing numerous pictorial analyses of each entry in the Protein Data Bank. It portrays the structural features of all proteins, DNA and ligands in the entry, as well as depicting the interactions between them. The latest features, described here, include annotation of human protein sequences with their naturally occurring amino acid variants, dynamic graphs showing the relationships between related protein domain architectures, analyses of ligand binding clusters across different experimental determinations of the same protein, analyses of tunnels in proteins and new search options.
Understanding which residues in an enzyme are catalytic and what function they perform is crucial to many biology studies, particularly those leading to new therapeutics and enzyme design. The original version of the Catalytic Site Atlas (CSA) (http://www.ebi.ac.uk/thornton-srv/databases/CSA), published in 2004, which catalogs the residues involved in enzyme catalysis in experimentally determined protein structures, had only 177 curated entries and employed a simplistic approach to expanding these annotations to homologous enzyme structures. Here we present a new version of the CSA (CSA 2.0), which greatly expands the number of both curated (968) and automatically annotated catalytic sites in enzyme structures, utilizing a new method for annotation transfer. The curated entries are used, along with the variation in residue type from the sequence comparison, to generate 3D templates of the catalytic sites, which in turn can be used to find catalytic sites in new structures. To ease the transfer of CSA annotations to other resources, a new ontology has been developed: the Enzyme Mechanism Ontology, which has permitted the transfer of annotations to Mechanism, Annotation and Classification in Enzymes (MACiE) and UniProt Knowledge Base (UniProtKB) resources. The CSA database schema has been re-designed and both the CSA data and search capabilities are presented in a new modern web interface.
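The idea of a 3D template that tolerates variation in residue type can be sketched in Python as follows. This is a toy illustration, not the CSA method: it represents each template position by a set of allowed residue types and a single coordinate, and accepts a candidate when all pairwise distances agree with the template within a tolerance. The function name, data layout and tolerance value are assumptions made for the example.

```python
from itertools import permutations
from math import dist


def find_site_matches(template, residues, tol=1.0):
    """Find residue tuples in a structure that fit a catalytic-site template.

    template: list of (allowed_types, (x, y, z)) tuples, one per position.
    residues: list of (residue_id, residue_type, (x, y, z)) tuples.
    tol: maximum allowed deviation, per pairwise distance, in the same
         units as the coordinates (hypothetical default).
    """
    n = len(template)
    matches = []
    # Brute force is acceptable here because templates have few positions.
    for combo in permutations(residues, n):
        # Residue type must be among those allowed at each position.
        if any(res[1] not in pos[0] for pos, res in zip(template, combo)):
            continue
        # Compare every pairwise distance in the candidate with the template.
        fits = all(
            abs(dist(template[i][1], template[j][1])
                - dist(combo[i][2], combo[j][2])) <= tol
            for i in range(n)
            for j in range(i + 1, n)
        )
        if fits:
            matches.append(tuple(res[0] for res in combo))
    return matches


# Toy template: Ser, His and an acidic residue (Asp or Glu) in a 3-4-5 triangle.
template = [({"SER"}, (0, 0, 0)), ({"HIS"}, (3, 0, 0)), ({"ASP", "GLU"}, (3, 4, 0))]
structure = [
    ("S195", "SER", (10, 10, 10)),
    ("H57", "HIS", (13, 10, 10)),
    ("E102", "GLU", (13, 14, 10)),
    ("A1", "ALA", (0, 0, 0)),
    ("H200", "HIS", (30, 10, 10)),
]
print(find_site_matches(template, structure))
```

Matching on distances alone is invariant to rotation and translation but cannot distinguish a site from its mirror image; a real implementation would more likely superpose atoms and score the fit.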
Identifying when past exposure to an infectious disease will protect against newly emerging strains is central to understanding the spread and the severity of epidemics, but the prediction of viral cross-protection remains an important unsolved problem. For foot-and-mouth disease virus (FMDV) research in particular, improved methods for predicting this cross-protection are critical for predicting the severity of outbreaks within endemic settings where multiple serotypes and subtypes commonly co-circulate, as well as for deciding whether appropriate vaccine(s) exist and how much they could mitigate the effects of any outbreak. To identify antigenic relationships and their predictors, we used linear mixed effects models to account for variation in pairwise cross-neutralization titres using only viral sequences and structural data. We identified those substitutions in surface-exposed structural proteins that are correlates of loss of cross-reactivity. These allowed prediction of both the best vaccine match for any single virus and the breadth of coverage of new vaccine candidates from their capsid sequences as effectively as or better than serology. Sub-sequences chosen by the model-building process all contained sites that are known epitopes on other serotypes. Furthermore, for the SAT1 serotype, for which epitopes have never previously been identified, we provide strong evidence--by controlling for phylogenetic structure--for the presence of three epitopes across a panel of viruses and quantify the relative significance of some individual residues in determining cross-neutralization. Identifying and quantifying the importance of sites that predict viral strain cross-reactivity not just for single viruses but across entire serotypes can help in the design of vaccines with better targeting and broader coverage. These techniques can be generalized to any infectious agents where cross-reactivity assays have been carried out. 
As the parameterization uses pre-existing datasets, this approach quickly and cheaply increases both our understanding of antigenic relationships and our power to control disease.
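To make the inputs of such an analysis concrete, the following Python sketch derives, for every pair of aligned capsid sequences, the surface-exposed positions at which they differ. Counts or indicators of this kind could serve as predictors of cross-neutralization titres in a mixed effects model; the model fitting itself is not shown. The sequences, the set of exposed positions and all function names are hypothetical, and the sketch assumes the sequences are already aligned to equal length.

```python
from itertools import combinations


def exposed_substitutions(seq_a, seq_b, exposed_sites):
    """Return the sorted exposed positions (0-based) where two sequences differ.

    ASSUMPTION: seq_a and seq_b are pre-aligned and of equal length, and
    exposed_sites comes from structural data for the capsid proteins.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i in sorted(exposed_sites) if seq_a[i] != seq_b[i]]


def pairwise_features(sequences, exposed_sites):
    """Map each unordered virus pair to its exposed-site substitutions."""
    return {
        (a, b): exposed_substitutions(sequences[a], sequences[b], exposed_sites)
        for a, b in combinations(sorted(sequences), 2)
    }


# Made-up toy data, for illustration only.
toy_sequences = {
    "virus_A": "MKTAYG",
    "virus_B": "MRTAHG",
    "virus_C": "MKTAYG",
}
toy_exposed = {1, 2, 4}

for pair, sites in pairwise_features(toy_sequences, toy_exposed).items():
    print(pair, sites, len(sites))
```

Keeping the positions rather than only a count allows individual sites to be tested as separate terms, in line with the abstract's aim of quantifying the contribution of individual residues.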