Fuzhou University
ORCID: 0000-0003-2285-8203
Publishes on Machine Learning in Bioinformatics, Genomics and Phylogenetic Studies, and RNA and protein synthesis mechanisms. 41 papers and 6.6k citations.
Protein sequence alignment is essential for template-based protein structure prediction and function annotation. We collect 20 sequence alignment algorithms, 10 published and 10 newly developed, which cover all representative sequence- and profile-based alignment approaches. These algorithms are benchmarked on 538 non-redundant proteins for protein fold recognition against a uniform template library. The results demonstrate a dominant advantage of profile-profile methods, which generate models with an average TM-score 26.5% higher than sequence-profile methods and 49.8% higher than sequence-sequence alignment methods. There is no obvious difference in results between methods whose profiles are generated from PSI-BLAST position-specific scoring matrices (PSSMs) and those whose profiles are generated from hidden Markov models. The accuracy of profile-profile alignments can be further improved by 9.6% or 21.4% when predicted or native structure features, respectively, are incorporated. Nevertheless, TM-scores from profile-profile methods that include experimental structural features are still 37.1% lower than those from TM-align, demonstrating that the fold-recognition problem cannot be solved solely by improving the accuracy of structure feature predictions.
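Since the comparison above is expressed entirely in TM-scores, a minimal sketch of how a TM-score is evaluated for one fixed superposition may help. It uses the standard definition, in which each aligned residue pair at distance d contributes 1 / (1 + (d/d0)^2) and the sum is normalized by the target length. The function names and the 0.5 Å floor applied to d0 for short chains are assumptions of this sketch, not details taken from the paper, and a real TM-score additionally searches over superpositions for the maximum.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in Angstroms).

    The floor of 0.5 for short chains is an assumption of this sketch.
    """
    if target_length <= 15:
        # The cube-root term is undefined for a non-positive argument.
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score for ONE given superposition.

    distances: distances (Angstroms) between aligned residue pairs.
    target_length: number of residues in the target protein; unaligned
    residues contribute nothing but still count in the normalization.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # Hypothetical example: 80 of 100 residues aligned, each 2 Angstroms away.
    print(round(tm_score([2.0] * 80, 100), 3))
```

Because the score is normalized by target length and not by the number of aligned pairs, an alignment covering only part of the target is penalized, which is why it suits fold-recognition benchmarks like the one described.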
A novel esterase gene (estSL3) was cloned from Alkalibacterium sp. SL3, which was isolated from the sediment of the soda lake Dabusu. The 636-bp full-length gene encodes a polypeptide of 211 amino acid residues that is closely related to putative GDSL family lipases from Alkalibacterium and Enterococcus. The gene was successfully expressed in E. coli, and the recombinant protein (rEstSL3) was purified to electrophoretic homogeneity and characterized. rEstSL3 exhibited the highest activity towards pNP-acetate and had no activity towards pNP-esters with acyl chains longer than C8. The enzyme was highly cold-adapted, showing an apparent temperature optimum of 30 °C and retaining approximately 70% of its activity at 0 °C. It was active and stable over the pH range from 7 to 10, and highly salt-tolerant up to 5 M NaCl. Moreover, rEstSL3 was strongly resistant to most tested metal ions, chemical reagents, detergents and organic solvents. Amino acid composition analysis indicated that EstSL3 had fewer proline residues, hydrogen bonds and salt bridges than its mesophilic and thermophilic counterparts, but more acidic amino acids and fewer hydrophobic amino acids than other salt-tolerant esterases. Its cold-active, salt-tolerant and chemical-resistant properties make it a promising enzyme for basic research and industrial applications.
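The composition comparison described above can be sketched as a simple residue count. This is an illustrative outline only: the function names are hypothetical, the hydrophobic residue set is one common convention chosen here as an assumption, and the length check assumes the 636 bp figure includes the stop codon.

```python
ACIDIC = set("DE")
# Assumption: one common choice of hydrophobic residues; definitions vary.
HYDROPHOBIC = set("AVILMFWC")


def composition_summary(sequence):
    """Return the fractions of acidic, hydrophobic and proline residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    return {
        "acidic": sum(r in ACIDIC for r in seq) / n,
        "hydrophobic": sum(r in HYDROPHOBIC for r in seq) / n,
        "proline": seq.count("P") / n,
    }


def orf_residue_count(orf_length_bp):
    """Residues encoded by an open reading frame that ends in a stop codon."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3 - 1


if __name__ == "__main__":
    # Toy peptide, not the EstSL3 sequence.
    print(composition_summary("MDEEPALV"))
    print(orf_residue_count(636))
```

Under the stop-codon assumption, 636 bp corresponds to 212 codons and therefore 211 residues, matching the figures in the abstract.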
Protein S-sulfenylation (SOH) is a type of post-translational modification in which cysteine thiols are oxidized to sulfenic acids. It acts as a redox switch that modulates diverse cellular processes and plays important roles in signal transduction, protein folding and enzymatic catalysis. Reversible SOH is also a key component in maintaining redox homeostasis and has been implicated in a variety of human diseases caused by redox imbalance, such as cancer, diabetes, and atherosclerosis. Despite its significance, in situ trapping of the entire 'sulfenome' remains a major challenge. Yang et al. have recently identified about 1000 SOH sites experimentally, providing an enriched benchmark SOH dataset. In this work, we developed a new ensemble learning tool, SOHPRED, for identifying protein SOH sites based on the compositions of enriched amino acids and the physicochemical properties of residues surrounding SOH sites. SOHPRED was built from four complementary predictors, i.e. a naive Bayesian predictor, a random forest predictor and two support vector machine predictors, whose training features are, respectively, amino acid occurrences, physicochemical properties, frequencies of k-spaced amino acid pairs and sequence profiles. Benchmarking experiments with 5-fold cross-validation and independent tests show that SOHPRED achieved AUC values of 0.784 and 0.799, respectively, outperforming several previously developed tools. As a real application of SOHPRED, we predicted potential SOH sites for 193 S-sulfenylated substrates that had been experimentally detected through global sulfenome profiling in living cells but whose actual SOH sites had not been determined. The SOHPRED web server has been made publicly available to the wider research community. The source code and the benchmark datasets can be downloaded from the website.
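One of the feature types named above, the frequencies of k-spaced amino acid pairs, can be illustrated with a short sketch. It counts residue pairs separated by exactly k intervening positions within a sequence window and normalizes by the number of such pairs. The function name, the window handling and the normalization are assumptions made for illustration and are not taken from the SOHPRED source code.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_spaced_pair_frequencies(window, k):
    """Frequencies of residue pairs separated by k intervening residues.

    window: peptide window (for example, residues flanking a cysteine).
    k: number of residues between the two members of a pair (k >= 0).
    Returns a dict with one entry per ordered pair (400 in total).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    # Number of positions i that have a partner at position i + k + 1.
    n_positions = len(window) - k - 1
    if n_positions <= 0:
        return {p: 0.0 for p in pairs}
    for i in range(n_positions):
        pair = window[i] + window[i + k + 1]
        # Pairs with non-standard symbols (gaps, padding) are skipped.
        if pair in counts:
            counts[pair] += 1
    return {p: c / n_positions for p, c in counts.items()}


if __name__ == "__main__":
    # Toy window, not taken from the benchmark dataset.
    features = k_spaced_pair_frequencies("ACAC", 1)
    print(features["AA"], features["CC"])
```

Concatenating these 400-dimensional vectors for several values of k gives a fixed-length encoding of the sequence context that a classifier such as a support vector machine can consume.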