InterPro: the protein sequence classification resource in 2025
InterPro (https://www.ebi.ac.uk/interpro) is a freely accessible resource for the classification of protein sequences into families. It integrates predictive models, known as signatures, from multiple member databases to classify sequences into families and predict the presence of domains and significant sites. The InterPro database provides annotations for over 200 million sequences, ensuring extensive coverage of UniProtKB, the standard repository of protein sequences, and includes mappings to several other major resources, such as Gene Ontology (GO), Protein Data Bank in Europe (PDBe) and the AlphaFold Protein Structure Database. In this publication, we report on the status of InterPro (version 101.0), detailing new developments in the database, associated web interface and software. Notable updates include the increased integration of structures predicted by AlphaFold and the enhanced description of protein families using artificial intelligence. Over the past two years, more than 5000 new InterPro entries have been created. The InterPro website now offers access to 85 000 protein families and domains from its member databases and serves as a long-term archive for retired databases. InterPro data, software and tools are freely available.
Using RINs to understand cancer mutations: deleterious mutations are more commonly associated to highly connected amino acids
Laise Cavalcanti Florentino | LA Referencia (Red Federada de Repositorios Institucionales de Publicaciones Científicas) | 2018
In recent decades, advances in whole-genome sequencing have led to the identification of a vast number of cancer-related mutations. Estimating the impact of cancer mutations on protein structure with high performance is not an easy task, and most studies are limited to one-by-one analysis of whole structures. Moreover, many challenges remain on the way to the precise and automated prediction of deleterious mutations. Therefore, understanding the structural impact of a particular amino acid change is hugely important for cancer medical research. However, most studies have emphasized sequence and structural modifications based on the chemical characteristics of amino acids, not fold features, in which the conservation of noncovalent interactions plays a significant role. In the present study, we therefore used residue interaction networks (RINs) for large-scale analysis of cancer missense mutations in order to infer their effects on the conservation of noncovalent interactions. We hypothesize that changes in highly connected amino acids are more likely to be deleterious. To evaluate this, we retrieved cancer missense mutations from the COSMIC (cancer.sanger.ac.uk/cosmic) and TCGA (cancergenome.nih.gov) databases and mapped them to their respective structures retrieved from the Protein Data Bank (rcsb.org). Then, RINs were constructed from the obtained PDB files, and network parameters such as node degree, edge type, clustering coefficient and weighted betweenness were assessed and plotted using R scripts.
Later, we compared these results against reported missense single nucleotide polymorphisms retrieved from dbSNP (www.ncbi.nlm.nih.gov/projects/SNP/) and against pathogenic and nonpathogenic cancer mutations from ClinVar (www.ncbi.nlm.nih.gov/clinvar/). Our results demonstrate that the distribution of mutations per degree (node connectivity) differs significantly from that of random Monte Carlo simulations, with mutations tending to remain at nodes with lower connectivity. We also compare it with the distribution of a set of human single nucleotide polymorphisms (SNPs). In addition, the proportion of deleterious mutations was significantly higher in nodes with a high degree of connectivity when two different criteria were used for their classification: proportions of software predictors (Ndamage) and clinical classification obtained from ClinVar. Considering these results, we conclude that changes in highly connected amino acids are, in fact, more prone to generate deleterious mutations, due to their higher proportion of occurrence in these nodes. Our results also indicate that the conservation of noncovalent interactions is an important parameter to consider when evaluating the effects of mutations, and that RIN analysis can be used as an additional parameter to aid in the prediction of deleterious mutations in cancer.
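The degree-based comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the abstract reports R scripts and does not name the RIN construction tool). Here a RIN is approximated, as an assumption, by linking residues whose representative atoms lie within a distance cutoff, whereas the study classifies edges by noncovalent interaction type. The function names, the 8 Å cutoff, and the toy coordinates are all hypothetical.

```python
import math
import random
from collections import defaultdict


def build_rin(coords, cutoff=8.0):
    """Build a simplified residue interaction network.

    coords maps a residue id to an (x, y, z) position in angstroms.
    ASSUMPTION: two residues interact when they are within `cutoff`;
    a real RIN would use typed noncovalent contacts instead.
    """
    adjacency = defaultdict(set)
    ids = list(coords)
    for i, a in enumerate(ids):
        adjacency[a]  # make sure isolated residues still appear as nodes
        for b in ids[i + 1:]:
            if math.dist(coords[a], coords[b]) <= cutoff:
                adjacency[a].add(b)
                adjacency[b].add(a)
    return dict(adjacency)


def degrees(adjacency):
    """Node degree = number of interaction partners of each residue."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def monte_carlo_mean_degree(degree, mutated, n_trials=10000, seed=0):
    """Compare the mean degree of mutated residues with random residue sets.

    Returns (observed mean, mean of the null distribution, two-sided
    empirical p-value). The null model draws the same number of residues
    uniformly at random without replacement (a hypothetical choice).
    """
    rng = random.Random(seed)
    nodes = list(degree)
    k = len(mutated)
    observed = sum(degree[r] for r in mutated) / k
    null_means = []
    for _ in range(n_trials):
        sample = rng.sample(nodes, k)
        null_means.append(sum(degree[r] for r in sample) / k)
    null_center = sum(null_means) / n_trials
    extreme = sum(
        1 for m in null_means
        if abs(m - null_center) >= abs(observed - null_center)
    )
    p_value = (extreme + 1) / (n_trials + 1)
    return observed, null_center, p_value


# Toy example: five residues on a line, 5 angstroms apart.
toy_coords = {i: (5.0 * i, 0.0, 0.0) for i in range(1, 6)}
toy_degree = degrees(build_rin(toy_coords, cutoff=8.0))
print(toy_degree)  # {1: 1, 2: 2, 3: 2, 4: 2, 5: 1}
```

On real data, `coords` would come from parsed PDB files and `mutated` from the residue positions of mapped missense mutations. The same permutation idea could be applied separately to the deleterious and non-deleterious subsets to compare their degree distributions.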