University of Michigan
ORCID: 0000-0003-1738-0591
Publishes on Protein Structure and Dynamics, Computational Drug Discovery Methods, and RNA and protein synthesis mechanisms. 26 papers and 15.7k citations.
MOTIVATION: Identification of protein-ligand binding sites is critical to protein function annotation and drug discovery. However, no single method generates optimal binding site predictions across different protein types. Combining complementary predictions is probably the most reliable solution to the problem. RESULTS: We develop two new methods, one based on binding-specific substructure comparison (TM-SITE) and another on sequence profile alignment (S-SITE), for complementary binding site predictions. The methods are tested on a set of 500 non-redundant proteins harboring 814 natural, drug-like and metal ion molecules. Starting from low-resolution protein structure predictions, the methods successfully recognize >51% of binding residues, with an average Matthews correlation coefficient (MCC) significantly higher (P-value < 10^-9 in Student's t-test) than other state-of-the-art methods, including COFACTOR, FINDSITE and ConCavity. When TM-SITE and S-SITE are combined with other structure-based programs, a consensus approach (COACH) can increase MCC by 15% over the best individual predictions. COACH was examined in the recent community-wide CAMEO experiment and consistently ranked as the best method in the last 22 individual datasets, with an Area Under the Curve score 22.5% higher than the second-best method. These data demonstrate a new robust approach to protein-ligand binding site recognition, which is ready for genome-wide structure-based function annotations. AVAILABILITY: http://zhanglab.ccmb.med.umich.edu/COACH/
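The abstract above reports accuracy as a Matthews correlation coefficient over binding residues. As a rough illustration of what that number measures, here is a minimal sketch of the standard MCC formula applied to per-residue predictions. The function name `binding_site_mcc` and the residue numbers are hypothetical and chosen only for this example; this is not the authors' evaluation script.

```python
import math


def binding_site_mcc(predicted, actual, n_residues):
    """Matthews correlation coefficient for a binding-residue prediction.

    predicted, actual: iterables of residue indices labeled as binding.
    n_residues: total number of residues in the protein chain.
    (All names here are illustrative assumptions, not from the paper.)
    """
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)       # binding residues correctly predicted
    fp = len(predicted - actual)       # predicted as binding, but not binding
    fn = len(actual - predicted)       # binding residues that were missed
    tn = n_residues - tp - fp - fn     # non-binding residues correctly left out

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is reported as 0 when it is undefined,
        # e.g. when nothing is predicted as binding.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical 100-residue protein with four true binding residues.
mcc = binding_site_mcc(predicted=[10, 11, 12, 40],
                       actual=[10, 11, 12, 13],
                       n_residues=100)
print(round(mcc, 3))
```

Because most residues in a protein are non-binding, MCC is a more informative summary than raw accuracy: it uses all four cells of the confusion matrix and stays near 0 for a predictor that labels everything as non-binding.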
BioLiP (http://zhanglab.ccmb.med.umich.edu/BioLiP/) is a semi-manually curated database of biologically relevant ligand-protein interactions. Establishing interactions between proteins and biologically relevant ligands is an important step toward understanding protein function. Most ligand-binding site prediction methods use protein structures from the Protein Data Bank (PDB) as templates. However, not all ligands present in the PDB are biologically relevant, as small molecules are often used as additives for solving protein structures. To facilitate template-based ligand-protein docking, virtual ligand screening and protein function annotation, we develop a hierarchical procedure for assessing the biological relevance of ligands present in PDB structures, which involves four-step biological feature filtering followed by careful manual verification. This procedure is used for BioLiP construction. Each entry in BioLiP contains annotations on ligand-binding residues, ligand-binding affinity, catalytic sites, Enzyme Commission numbers, Gene Ontology terms and cross-links to other databases. In addition, to facilitate the use of BioLiP for function annotation of uncharacterized proteins, a new consensus-based algorithm, COACH, is developed to predict ligand-binding sites from protein sequence or 3D structure. The BioLiP database is updated weekly and the current release contains 204,223 entries.
We have developed a new COFACTOR webserver for automated structure-based protein function annotation. Starting from a structural model, given by either experimental determination or computational modeling, COFACTOR first identifies template proteins of similar folds and functional sites by threading the target structure through three representative template libraries that have known protein-ligand binding interactions, Enzyme Commission numbers or Gene Ontology terms. Functional insights in these three aspects are then deduced from the functional templates, and their confidence is evaluated by a scoring function that combines global and local structural similarities. The algorithm has been extensively evaluated in large-scale benchmark tests and demonstrates significant advantages over traditional sequence-based methods. In the recent community-wide CASP9 experiment, COFACTOR was ranked as the best method for protein-ligand binding site predictions. The COFACTOR server and the template libraries are freely available at http://zhanglab.ccmb.med.umich.edu/COFACTOR.