Predicting Drug-Target Interaction Networks Based on Functional Groups and Biological Features
BACKGROUND: The study of drug-target interaction networks is an important topic in drug development. Determining compound-protein interactions or potential drug-target interactions by experiment alone is both time-consuming and costly. As a complement, in silico prediction methods can provide useful information in a timely manner. METHODS/PRINCIPAL FINDINGS: To this end, drug compounds are encoded by their functional groups and proteins by biological features, including biochemical and physicochemical properties. Optimal features are selected with the mRMR (Maximum Relevance Minimum Redundancy) method. Instead of treating the proteins as a single family, target proteins are divided into four groups: enzymes, ion channels, G-protein-coupled receptors and nuclear receptors. Four independent predictors are then established, each using the Nearest Neighbor algorithm as its operation engine and each predicting the interactions between drugs and one of the four protein groups. The overall success rates achieved by the four predictors in jackknife cross-validation tests are 85.48%, 80.78%, 78.49%, and 85.66%, respectively. CONCLUSION/SIGNIFICANCE: Our results indicate that the network prediction system thus established is quite promising.
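The evaluation scheme named in the abstract above, a Nearest Neighbor predictor scored by the jackknife (leave-one-out) test, can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance, the function names, and the toy two-dimensional feature vectors are all assumptions made for illustration, and the abstract does not state which distance measure or encoding dimensions were used.

```python
import math

def euclidean(a, b):
    # Assumed distance measure; the abstract does not specify one.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbor_label(query, samples, labels):
    # Return the label of the single closest training sample (1-NN).
    best = min(range(len(samples)), key=lambda i: euclidean(query, samples[i]))
    return labels[best]

def jackknife_success_rate(samples, labels):
    # Leave each sample out in turn, predict it from the rest,
    # and report the fraction predicted correctly.
    correct = 0
    for i in range(len(samples)):
        rest_x = samples[:i] + samples[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        if nearest_neighbor_label(samples[i], rest_x, rest_y) == labels[i]:
            correct += 1
    return correct / len(samples)

# Hypothetical toy data: each vector stands in for one encoded drug-protein
# pair; 1 = interacting, 0 = non-interacting.
pairs = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [1.0, 0.9], [0.9, 1.0], [1.1, 1.0]]
interacts = [0, 0, 0, 1, 1, 1]
rate = jackknife_success_rate(pairs, interacts)
print(f"jackknife success rate: {rate:.2%}")
```

In the setting described by the abstract, one such predictor would be trained and evaluated separately for each of the four target groups.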
Predicting Functions of Proteins in Mouse Based on Weighted Protein-Protein Interaction Network and Protein Hybrid Properties
BACKGROUND: With the huge number of uncharacterized protein sequences generated in the post-genomic age, it is highly desirable to develop effective computational methods for predicting their functions quickly and accurately. The information thus obtained would be very useful for both basic research and drug development. METHODOLOGY/PRINCIPAL FINDINGS: Although many efforts have been made in this regard, most of them were based on either sequence similarity or protein-protein interaction (PPI) information. However, the former often fails if a query protein has little or no sequence similarity to any protein of known function, while the latter has a similar problem if the relevant PPI information is not available. In view of this, a new approach is proposed that hybridizes the PPI information with the biochemical/physicochemical features of protein sequences. The overall first-order success rates by the new predictor for the functions of mouse proteins on training set and test set were 69.1% and 70.2%, respectively, and the success rate covered by the results of the top-4 order from a total of 24 orders was 65.2%. CONCLUSIONS/SIGNIFICANCE: The results indicate that the new approach is quite promising and may open a new avenue for addressing this difficult and complicated problem.
Analysis and Prediction of the Metabolic Stability of Proteins Based on Their Sequential Features, Subcellular Locations and Interaction Networks
Metabolic stability is a very important property of proteins that is related to their global flexibility, intramolecular fluctuations, various internal dynamic processes, and many biological functions. Determining a protein's metabolic stability would provide useful information for in-depth understanding of the dynamic action mechanisms of proteins. Although several experimental methods have been developed to measure a protein's metabolic stability, they are time-consuming and expensive. Reported in this paper is a computational method that features (1) integrating various properties of proteins, such as biochemical and physicochemical properties, subcellular locations, network properties and protein complex property, (2) using the mRMR (Maximum Relevance & Minimum Redundancy) principle and the IFS (Incremental Feature Selection) procedure to optimize the prediction engine, and (3) being able to classify proteins into four types: "short", "medium", "long", and "extra-long" half-life spans. Our analysis revealed that the following seven features played major roles in determining the stability of proteins: (1) KEGG enrichment scores of the protein and its neighbors in the network, (2) subcellular locations, (3) polarity, (4) amino acid composition, (5) hydrophobicity, (6) secondary structure propensity, and (7) the number of protein complexes in which the protein is involved. An intriguing correlation was observed between the predicted metabolic stability of some proteins and the real half-life of the drugs designed to target them. These findings might provide useful insights for designing protein-stability-relevant drugs.
The computational method can also be used as a large-scale tool for annotating the metabolic stability for the avalanche of protein sequences generated in the post-genomic age.
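The mRMR ordering and IFS procedure mentioned in the abstract above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes discrete features, uses the common "relevance minus mean redundancy" form of mRMR with mutual information, and substitutes a toy resubstitution accuracy for the cross-validated prediction engine. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    # Mutual information (in nats) between two discrete sequences.
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def mrmr_order(columns, target):
    # Greedily rank features by relevance to the target minus the mean
    # redundancy with the features already selected.
    relevance = [mutual_information(col, target) for col in columns]
    remaining = list(range(len(columns)))
    ordered = []

    def score(j):
        if not ordered:
            return relevance[j]
        redundancy = sum(mutual_information(columns[j], columns[k])
                         for k in ordered) / len(ordered)
        return relevance[j] - redundancy

    while remaining:
        best = max(remaining, key=score)
        ordered.append(best)
        remaining.remove(best)
    return ordered

def incremental_feature_selection(ordered, evaluate):
    # Evaluate the top-1, top-2, ... feature subsets and keep the smallest
    # subset that reaches the highest score.
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ordered) + 1):
        subset = ordered[:k]
        current = evaluate(subset)
        if current > best_score:
            best_subset, best_score = subset, current
    return best_subset, best_score

def resubstitution_accuracy(subset, columns, target):
    # Toy stand-in for a cross-validated predictor: majority label per
    # distinct feature tuple, scored on the same data.
    groups = {}
    for i, label in enumerate(target):
        key = tuple(columns[j][i] for j in subset)
        groups.setdefault(key, Counter())[label] += 1
    return sum(max(c.values()) for c in groups.values()) / len(target)

# Hypothetical toy data: feature 1 duplicates feature 0, feature 2 is noise.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
order = mrmr_order(columns, target)
best_subset, best_score = incremental_feature_selection(
    order, lambda s: resubstitution_accuracy(s, columns, target))
print(order, best_subset, best_score)
```

In this toy case the duplicated feature is ranked last despite its high relevance, which is the redundancy penalty at work; a real study would replace the resubstitution score with a cross-validated success rate.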
Prediction of lysine ubiquitination with mRMR feature selection and analysis
Yu‐Dong Cai, Tao Huang, Le‐Le Hu et al.|Amino Acids|2011

Removal of Hsf4 leads to cataract development in mice through down-regulation of γS-crystallin and Bfsp expression
Xiaohe Shi, Bin Cui, Zhugang Wang et al.|BMC Molecular Biology|2009
BACKGROUND: Heat-shock transcription factor 4 (HSF4) mutations are associated with autosomal dominant lamellar cataract and Marner cataract. Disruptions of the Hsf4 gene cause lens defects in mice, indicating a requirement for HSF4 in fiber cell differentiation during lens development. However, neither the relationship between HSF4 and crystallins nor the detailed mechanism of maintenance of lens transparency by HSF4 is fully understood. RESULTS: In an attempt to determine how the underlying biomedical and physiological mechanisms resulting from loss of HSF4 contribute to cataract formation, we generated an Hsf4 knockout mouse model. We showed that the Hsf4 knockout mouse (Hsf4-/-) partially mimics the human cataract caused by HSF4 mutations. Q-PCR analysis revealed down-regulation of several cataract-relevant genes, including gamma S-crystallin (Crygs) and lens-specific beaded filament proteins 1 and 2 (Bfsp1 and Bfsp2), in the lens of the Hsf4-/- mouse. Transcription activity analysis using the dual-luciferase system suggested that these cataract-relevant genes are the direct downstream targets of HSF4. The effect of HSF4 on gamma S-crystallin is exemplified by the cataractogenesis seen in the Hsf4-/-,rncat intercross. The 2D electrophoretic analysis of whole-lens lysates revealed a different expression pattern in 8-week-old Hsf4-/- mice compared with their wild-type counterparts, including the loss of some alpha A-crystallin modifications and reduced expression of gamma-crystallin proteins.
CONCLUSION: Our results indicate that HSF4 is important for lens development and that disruption of the Hsf4 gene leads to cataracts via at least three pathways: 1) down-regulation of gamma-crystallin, particularly gamma S-crystallin; 2) decreased lens beaded filament expression; and 3) loss of post-translational modification of alpha A-crystallin.