Zhengwei Zhu

CD-HIT: accelerated for clustering the next-generation sequencing data

LiMin Fu, Beifang Niu, Zhengwei Zhu et al.|Bioinformatics|2012

Cited by 11.7k|Open Access

SUMMARY: CD-HIT is a widely used program for clustering biological sequences to reduce sequence redundancy and improve the performance of other sequence analyses. In response to the rapid increase in the amount of sequencing data produced by the next-generation sequencing technologies, we have developed a new CD-HIT program accelerated with a novel parallelization strategy and some other techniques to allow efficient clustering of such datasets. Our tests demonstrated very good speedup derived from the parallelization for up to ∼24 cores and a quasi-linear speedup for up to ∼8 cores. The enhanced CD-HIT is capable of handling very large datasets in much shorter time than previous versions. AVAILABILITY: http://cd-hit.org. CONTACT: liwz@sdsc.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Senataxin Mutation Reveals How R-Loops Promote Transcription by Blocking DNA Methylation at Gene Promoters

Christopher Grunseich, Isabel X. Wang, Jason A. Watts et al.|Molecular Cell|2018

Cited by 255|Open Access

Human proteins that interact with RNA/DNA hybrids

Isabel X. Wang, Christopher Grunseich, Jennifer E. Fox et al.|Genome Research|2018

Cited by 220|Open Access

RNA/DNA hybrids form when RNA hybridizes with its template DNA, generating a three-stranded structure known as the R-loop. Knowledge of how they form and resolve, as well as their functional roles, is limited. Here, by pull-down assays followed by mass spectrometry, we identified 803 proteins that bind to RNA/DNA hybrids. Because these proteins were identified using in vitro assays, we confirmed that they bind to R-loops in vivo. They include proteins that are involved in a variety of functions, including most steps of RNA processing. The proteins are enriched for K homology (KH) and helicase domains. Among them, more than 300 proteins preferred binding to hybrids over double-stranded DNA. These proteins serve as starting points for mechanistic studies to elucidate what RNA/DNA hybrids regulate and how they are regulated.

Templates are available to model nearly all complexes of structurally characterized proteins

Petras J. Kundrotas, Zhengwei Zhu, Joël Janin et al.|Proceedings of the National Academy of Sciences|2012

Cited by 192|Open Access

Traditional approaches to protein-protein docking sample the binding modes with no regard to similar experimentally determined structures (templates) of protein-protein complexes. Emerging template-based docking approaches utilize such similar complexes to determine the docking predictions. The docking problem assumes the knowledge of the participating proteins' structures. Thus, it provides the possibility of aligning the structures of the proteins and the template complexes. The progress in the development of template-based docking and the vast experience in template-based modeling of individual proteins show that, generally, such approaches are more reliable than the free modeling. The key aspect of this modeling paradigm is the availability of the templates. The current common perception is that due to the difficulties in experimental structure determination of protein-protein complexes, the pool of docking templates is insignificant, and thus a broad application of template-based docking is possible only at some future time. The results of our large scale, systematic study show that, surprisingly, in spite of the limited number of protein-protein complexes in the Protein Data Bank, docking templates can be found for complexes representing almost all the known protein-protein interactions, provided the components themselves have a known structure or can be homology-built. About one-third of the templates are of good quality when they are compared to experimental structures in test sets extracted from the Protein Data Bank and would be useful starting points in modeling the complexes. This finding dramatically expands our ability to model protein interactions, and has far-reaching implications for the protein docking field in general.

Biodegradation of sulfamethoxazole by Phanerochaete chrysosporium

Xiali Guo, Zhengwei Zhu, Hongli Li|Journal of Molecular Liquids|2014

Cited by 56

