Global alignment of multiple protein interaction networks with application to functional orthology detection
Rohit Singh, Jinbo Xu, Bonnie Berger|Proceedings of the National Academy of Sciences|2008
Protein-protein interactions (PPIs) and their networks play a central role in all biological processes. Akin to the complete sequencing of genomes and their comparative analysis, complete descriptions of interactomes and their comparative analysis are fundamental to a deeper understanding of biological processes. A first step in such an analysis is to align two or more PPI networks. Here, we introduce an algorithm, IsoRank, for global alignment of multiple PPI networks. The guiding intuition is that a protein in one PPI network is a good match for a protein in another network if their respective sequences and neighborhood topologies are a good match. We encode this intuition as an eigenvalue problem in a manner analogous to Google's PageRank method. Using IsoRank, we compute a global alignment of the Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Mus musculus, and Homo sapiens PPI networks. We demonstrate that incorporating PPI data in ortholog prediction results in improvements over existing sequence-only approaches and over predictions from local alignments of the yeast and fly networks. Previous methods have been effective at identifying conserved, localized network patterns across pairs of networks. This work takes the further step of performing a global alignment of multiple PPI networks. It simultaneously uses sequence similarity and network data and, unlike previous approaches, explicitly models the tradeoff inherent in combining them. We expect IsoRank, with its simultaneous handling of node similarity and network similarity, to be applicable across many scientific domains.
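The PageRank-style formulation in the abstract above can be illustrated with a small power iteration. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`isorank`, `greedy_match`), the dictionary-based graph representation, the default `alpha = 0.6`, and the greedy one-to-one extraction step are all illustrative choices that the abstract does not specify. Each pair score is a weighted blend of support from neighboring pairs (topology) and a normalized sequence-similarity prior, with `alpha` controlling the tradeoff.

```python
def isorank(adj1, adj2, seq_sim, alpha=0.6, max_iter=200, tol=1e-12):
    """Score every cross-network node pair by power iteration (sketch).

    adj1, adj2: dict mapping node -> set of neighbours (undirected, symmetric).
    seq_sim:    dict mapping (node1, node2) -> non-negative similarity score.
    alpha:      weight on topology; (1 - alpha) goes to sequence similarity.
                Assumed to be strictly less than 1 so the iteration contracts.
    """
    pairs = [(i, j) for i in adj1 for j in adj2]

    # Normalise the sequence prior to sum to 1; fall back to uniform if empty.
    total = sum(seq_sim.get(p, 0.0) for p in pairs)
    if total > 0:
        prior = {p: seq_sim.get(p, 0.0) / total for p in pairs}
    else:
        prior = {p: 1.0 / len(pairs) for p in pairs}

    scores = dict(prior)
    for _ in range(max_iter):
        updated = {}
        for i, j in pairs:
            # A pair is supported by its neighbouring pairs, each of which
            # splits its score evenly among the pairs it neighbours.
            support = sum(
                scores[(u, v)] / (len(adj1[u]) * len(adj2[v]))
                for u in adj1[i]
                for v in adj2[j]
            )
            updated[(i, j)] = alpha * support + (1 - alpha) * prior[(i, j)]
        norm = sum(updated.values())
        updated = {p: s / norm for p, s in updated.items()}
        delta = max(abs(updated[p] - scores[p]) for p in pairs)
        scores = updated
        if delta < tol:
            break
    return scores


def greedy_match(scores):
    """Pick a one-to-one mapping by repeatedly taking the best unused pair."""
    used1, used2, mapping = set(), set(), {}
    for (i, j), _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
        if i not in used1 and j not in used2:
            mapping[i] = j
            used1.add(i)
            used2.add(j)
    return mapping


# Toy example: two three-node paths with no sequence information at all.
g1 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
g2 = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
scores = isorank(g1, g2, seq_sim={})
mapping = greedy_match(scores)
best_pair = max(scores, key=scores.get)
```

In this toy case topology alone decides the result: the two path centers, `b` and `y`, receive the highest score, while the end nodes are interchangeable by symmetry and are paired by the tie-breaking order. Supplying a non-empty `seq_sim` and lowering `alpha` shifts the balance toward sequence evidence.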
Pyro: Deep Universal Probabilistic Programming
Eli Bingham, Jonathan P. Chen, Martin Jankowiak et al.|arXiv (Cornell University)|2018
Pyro is a probabilistic programming language built on Python as a platform for developing advanced probabilistic models in AI research. To scale to large datasets and high-dimensional models, Pyro uses stochastic variational inference algorithms and probability distributions built on top of PyTorch, a modern GPU-accelerated deep learning framework. To accommodate complex or model-specific algorithmic behavior, Pyro leverages Poutine, a library of composable building blocks for modifying the behavior of probabilistic programs.
IsoRankN: spectral methods for global alignment of multiple protein networks
MOTIVATION: With the increasing availability of large protein-protein interaction networks, the question of protein network alignment is becoming central to systems biology. Network alignment is further delineated into two sub-problems: local alignment, to find small conserved motifs across networks, and global alignment, which attempts to find a best mapping between all nodes of the two networks. In this article, our aim is to improve upon existing global alignment results. Better network alignment will enable, among other things, more accurate identification of functional orthologs across species.
RESULTS: We introduce IsoRankN (IsoRank-Nibble), a global multiple-network alignment tool based on spectral clustering on the induced graph of pairwise alignment scores. IsoRankN outperforms existing algorithms for global network alignment in coverage and consistency on multiple alignments of the five available eukaryotic networks. Being based on spectral methods, IsoRankN is both error tolerant and computationally efficient.
AVAILABILITY: Our software is available freely for non-commercial purposes on request from: http://isorank.csail.mit.edu/.
Pairwise Global Alignment of Protein Interaction Networks by Matching Neighborhood Topology
Rohit Singh, Jinbo Xu, Bonnie Berger|Lecture notes in computer science|2007

D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions
We combine advances in neural language modeling and structurally motivated design to develop D-SCRIPT, an interpretable and generalizable deep-learning model, which predicts interaction between two proteins using only their sequence and maintains high accuracy with limited training data and across species. We show that a D-SCRIPT model trained on 38,345 human PPIs enables significantly improved functional characterization of fly proteins compared with the state-of-the-art approach. Evaluating the same D-SCRIPT model on protein complexes with known 3D structure, we find that the inter-protein contact map output by D-SCRIPT has significant overlap with the ground truth. We apply D-SCRIPT to screen for PPIs in cow (Bos taurus) at a genome-wide scale and, focusing on rumen physiology, identify functional gene modules related to metabolism and immune response. The predicted interactions can then be leveraged for function prediction at scale, addressing the genome-to-phenome challenge, especially in species where little data are available.