PhosphoSitePlus, 2014: mutations, PTMs and recalibrations
Peter Hornbeck, Bin Zhang, Beth Murray et al. | Nucleic Acids Research | 2014
PhosphoSitePlus® (PSP, http://www.phosphosite.org/), a knowledgebase dedicated to mammalian post-translational modifications (PTMs), contains over 330,000 non-redundant PTMs, including phospho, acetyl, ubiquityl and methyl groups. Over 95% of the sites are from mass spectrometry (MS) experiments. In order to improve data reliability, early MS data have been reanalyzed, applying a common standard of analysis across over 1,000,000 spectra. Site assignments with P > 0.05 were filtered out. Two new downloads are available from PSP. The 'Regulatory sites' dataset includes curated information about modification sites that regulate downstream cellular processes, molecular functions and protein-protein interactions. The 'PTMVar' dataset, an intersect of missense mutations and PTMs from PSP, identifies over 25,000 PTMVars (PTMs Impacted by Variants) that can rewire signaling pathways. The PTMVar data include missense mutations from UniProtKB, TCGA and other sources that cause over 2000 diseases or syndromes (MIM) and polymorphisms, or are associated with hundreds of cancers. PTMVars include 18,548 phosphorylation sites, 3412 ubiquitylation sites, 2316 acetylation sites, 685 methylation sites and 245 succinylation sites.
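The 'PTMVar' intersect described above can be illustrated with a minimal sketch. It assumes that a variant impacts a PTM only when the substituted residue is the modified residue itself; the abstract does not state the matching rule, and the record layout, accession strings and function name below are hypothetical.

```python
from typing import NamedTuple


class Site(NamedTuple):
    protein: str    # protein accession (hypothetical identifiers in the example)
    position: int   # 1-based residue number of the modified residue
    kind: str       # e.g. "phospho", "acetyl"


class Variant(NamedTuple):
    protein: str
    position: int   # 1-based residue number of the substituted residue
    change: str     # e.g. "S15A"


def intersect_ptm_variants(sites, variants):
    """Return (site, variant) pairs that share a protein and a residue position.

    Assumption: only exact position matches count; flanking residues are ignored.
    """
    index = {}
    for site in sites:
        index.setdefault((site.protein, site.position), []).append(site)

    hits = []
    for variant in variants:
        for site in index.get((variant.protein, variant.position), []):
            hits.append((site, variant))
    return hits
```

Indexing the sites by (protein, position) keeps the lookup for each variant constant-time, which matters when both inputs contain hundreds of thousands of records.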
An atlas of substrate specificities for the human serine/threonine kinome
Protein phosphorylation is one of the most widespread post-translational modifications in biology [1,2]. With advances in mass-spectrometry-based phosphoproteomics, 90,000 sites of serine and threonine phosphorylation have so far been identified, and several thousand have been associated with human diseases and biological processes [3,4]. For the vast majority of phosphorylation events, it is not yet known which of the more than 300 protein serine/threonine (Ser/Thr) kinases encoded in the human genome are responsible [3]. Here we used synthetic peptide libraries to profile the substrate sequence specificity of 303 Ser/Thr kinases, comprising more than 84% of those predicted to be active in humans. Viewed in its entirety, the substrate specificity of the kinome was substantially more diverse than expected and was driven extensively by negative selectivity. We used our kinome-wide dataset to computationally annotate and identify the kinases capable of phosphorylating every reported phosphorylation site in the human Ser/Thr phosphoproteome. For the small minority of phosphosites for which the putative protein kinases involved have been previously reported, our predictions were in excellent agreement. When this approach was applied to examine the signalling response of tissues and cell lines to hormones, growth factors, targeted inhibitors and environmental or genetic perturbations, it revealed unexpected insights into pathway complexity and compensation. Overall, these studies reveal the intrinsic substrate specificity of the human Ser/Thr kinome, illuminate cellular signalling responses and provide a resource to link phosphorylation events to biological pathways.
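One common way to turn peptide-library specificity profiles into site-level annotations is to score each candidate site against a position-specific matrix. The sketch below is an illustration only, not the authors' pipeline: the matrix values, the kinase names and the scoring rule (summed log2 preferences, where values below 1 stand for negative selectivity) are all assumptions made for the example.

```python
import math

# Hypothetical toy matrices: offset from the phospho-acceptor -> residue -> preference.
# A preference above 1 is favored, below 1 is disfavored; unlisted residues are neutral.
TOY_MATRICES = {
    "kinase_A": {-3: {"R": 4.0, "D": 0.25}, 1: {"P": 0.5}},
    "kinase_B": {-3: {"R": 0.5}, 1: {"P": 8.0}},
}


def score_site(matrix, window, center):
    """Sum log2 preferences over the positions of `window` covered by `matrix`.

    `window` is the sequence around the site; `center` is the 0-based index
    of the phospho-acceptor residue within it.
    """
    log_score = 0.0
    for offset, prefs in matrix.items():
        i = center + offset
        if 0 <= i < len(window):
            log_score += math.log2(prefs.get(window[i], 1.0))
    return log_score


def rank_kinases(matrices, window, center):
    """Return (kinase, score) pairs sorted from best to worst score."""
    scores = {name: score_site(m, window, center) for name, m in matrices.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

For instance, `rank_kinases(TOY_MATRICES, "RAASPLK", 3)` scores the serine at index 3 against both toy matrices; a disfavored residue lowers the score for a kinase, which is how negative selectivity enters the ranking.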
PhosphoSite: A bioinformatics resource dedicated to physiological protein phosphorylation
PhosphoSite is a curated, web-based bioinformatics resource dedicated to physiologic sites of protein phosphorylation in human and mouse. PhosphoSite is populated with information derived from published literature as well as high-throughput discovery programs. PhosphoSite provides information about the phosphorylated residue and its surrounding sequence, orthologous sites in other species, location of the site within known domains and motifs, and relevant literature references. Links are also provided to a number of external resources for protein sequences, structure, post-translational modifications and signaling pathways, as well as sources of phospho-specific antibodies and probes. As the amount of information in the underlying knowledgebase expands, users will be able to systematically search for the kinases, phosphatases, ligands, treatments, and receptors that have been shown to regulate the phosphorylation status of the sites, and pathways in which the phosphorylation sites function. As it develops into a comprehensive resource of known in vivo phosphorylation sites, we expect that PhosphoSite will be a valuable tool for researchers seeking to understand the role of intracellular signaling pathways in a wide variety of biological processes.
Three-photon imaging of mouse brain structure and function through the intact skull
Disentangling molecular relationships with a causal inference test
BACKGROUND: There has been intense effort over the past couple of decades to identify loci underlying quantitative traits as a key step in the process of elucidating the etiology of complex diseases. Recently there has been some effort to coalesce non-biased high-throughput data, e.g. high density genotyping and genome wide RNA expression, to drive understanding of the molecular basis of disease. However, a stumbling block has been the difficult question of how to leverage this information to identify molecular mechanisms that explain quantitative trait loci (QTL). We have developed a formal statistical hypothesis test, resulting in a p-value, to quantify uncertainty in a causal inference pertaining to a measured factor, e.g. a molecular species, which potentially mediates a known causal association between a locus and a quantitative trait. RESULTS: We treat the causal inference as a 'chain' of mathematical conditions that must be satisfied to conclude that the potential mediator is causal for the trait, where the inference is only as good as the weakest link in the chain. P-values are computed for the component conditions, which include tests of linkage and conditional independence. The Intersection-Union Test, in which a series of statistical tests are combined to form an omnibus test, is then employed to generate the overall test result. Using computer simulated mouse crosses, we show that type I error is low under a variety of conditions that include hidden variables and reactive pathways. We show that power under a simple causal model is comparable to other model selection techniques as well as Bayesian network reconstruction methods.
We further show empirically that this method compares favorably to Bayesian network reconstruction methods for reconstructing transcriptional regulatory networks in yeast, recovering 7 out of 8 experimentally validated regulators. CONCLUSION: Here we propose a novel statistical framework in which existing notions of causal mediation are formalized into a hypothesis test, thus providing a standard quantitative measure of uncertainty in the form of a p-value. The method is theoretically and computationally accessible and with the provided software may prove a useful tool in disentangling molecular relationships.
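The 'weakest link' logic of the Intersection-Union Test can be sketched in a few lines: the omnibus null hypothesis is rejected only if every component test rejects, so the overall p-value is the largest component p-value. The component names and p-values used with this sketch are hypothetical placeholders; the abstract specifies only that the components include tests of linkage and conditional independence.

```python
def intersection_union_pvalue(component_pvalues):
    """Return the omnibus p-value of an Intersection-Union Test.

    `component_pvalues` maps a component-test name to its p-value.
    The overall result is significant only if all components are,
    so the omnibus p-value is the maximum over the components.
    """
    if not component_pvalues:
        raise ValueError("at least one component p-value is required")
    for name, p in component_pvalues.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value for {name!r} is outside [0, 1]: {p}")
    return max(component_pvalues.values())


def weakest_link(component_pvalues):
    """Return the name of the component test with the largest p-value."""
    return max(component_pvalues, key=component_pvalues.get)
```

With hypothetical inputs such as `{"linkage_trait": 0.001, "linkage_mediator": 0.03, "conditional_independence": 0.2}`, the omnibus p-value is 0.2 and the conditional-independence component is the weakest link, so the causal call would not be made at the 0.05 level.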