
Xiaobei Zhou

SIB Swiss Institute of Bioinformatics

Publishes on Gene expression and cancer classification, Genetic Syndromes and Imprinting, Biomedical Text Mining and Ontologies. 26 papers and 785 citations.

26 Publications
785 Total Citations

Top publications by citations

Robustly detecting differential expression in RNA sequencing data using observation weights
Xiaobei Zhou, Helen Lindsay, Mark D. Robinson|Nucleic Acids Research|2014
Cited by 420|Open Access

A popular approach for comparing gene expression levels between (replicated) conditions of RNA sequencing data relies on counting reads that map to features of interest. Within such count-based methods, many flexible and advanced statistical approaches now exist and offer the ability to adjust for covariates (e.g. batch effects). Often, these methods include some sort of 'sharing of information' across features to improve inferences in small samples. It is important to achieve an appropriate tradeoff between statistical power and protection against outliers. Here, we study the robustness of existing approaches for count-based differential expression analysis and propose a new strategy based on observation weights that can be used within existing frameworks. The results suggest that outliers can have a global effect on differential analyses. We demonstrate the effectiveness of our new approach with real data and simulated data that reflects properties of real datasets (e.g. dispersion-mean trend) and develop an extensible framework for comprehensive testing of current and future methods. In addition, we explore the origin of such outliers, in some cases highlighting additional biological or technical factors within the experiment. Further details can be downloaded from the project website: http://imlspenticton.uzh.ch/robinson_lab/edgeR_robust/.

Statistical methods for detecting differentially methylated loci and regions
Mark D. Robinson, Abdullah Kahraman, Charity W. Law et al.|Frontiers in Genetics|2014
Cited by 132|Open Access

DNA methylation, the reversible addition of methyl groups at CpG dinucleotides, represents an important regulatory layer associated with gene expression. Changed methylation status has been noted across diverse pathological states, including cancer. The rapid development and uptake of microarrays and large scale DNA sequencing has prompted an explosion of data analytic methods for processing and discovering changes in DNA methylation across varied data types. In this mini-review, we present a compact and accessible discussion of many of the salient challenges, such as experimental design, statistical methods for differential methylation detection, critical considerations such as cell type composition and the potential confounding that can arise from batch effects. From a statistical perspective, our main interests include the use of empirical Bayes or hierarchical models, which have proved immensely powerful in genomics, and the procedures by which false discovery control is achieved.

A probabilistic model for co-occurrence analysis in bibliometrics
Xiaobei Zhou, Miao Zhou, Desheng Huang et al.|Journal of Biomedical Informatics|2022
Cited by 87|Open Access

Co-occurrence analysis of Medical Subject Heading (MeSH) terms extracted from the PubMed database is widely used in bibliometrics. In practice, to make the results interpretable, the co-occurrence matrix must be filtered to remove low-frequency items, which are poorly representative. Unfortunately, little research addresses how to determine a critical threshold for removing noise from the co-occurrence matrix. Here, we propose a probabilistic model for co-occurrence analysis that provides statistical inference about whether paired items co-occur randomly. With the help of this model, the dimensionality of the co-occurrence matrix can be reduced according to the selected threshold. The conceptual model framework, simulations and practical applications are illustrated in the manuscript. Further details (including all reproducible code) can be downloaded from the project website: https://github.com/xizhou/co-occurrence-analysis.git.

Common Features of Regulatory T Cell Specialization During Th1 Responses
Katharina Littringer, Claudia Moresi, Nikolas Rakebrandt et al.|Frontiers in Immunology|2018
Cited by 55|Open Access

CD4+Foxp3+ Treg cells are essential for maintaining self-tolerance and preventing excessive immune responses. In the context of Th1 immune responses, co-expression of the Th1 transcription factor T-bet with Foxp3 is essential for Treg cells to control Th1 responses. T-bet-dependent expression of CXCR3 directs Treg cells to the site of inflammation. However, the suppressive mediators enabling effective control of Th1 responses at this site are unknown. In this study, we determined the signature of CXCR3+ Treg cells arising in Th1 settings and defined universal features of Treg cells in this context using multiple Th1-dominated infection models. Our analysis defined a set of Th1-specific co-inhibitory receptors and cytotoxic molecules that are specifically expressed in Treg cells during Th1 immune responses in mice and humans. Among these, we identified the novel co-inhibitory receptor CD85k as a functional predictor for Treg-mediated suppression specifically of Th1 responses, which could be explored therapeutically for selective immune suppression in autoimmunity.

miRNA-Seq normalization comparisons need improvement
Cited by 27|Open Access

BACKGROUND: Currently there is no method of best practice for the normalization of microRNA sequencing data (miRNA-Seq). Therefore, we read with interest a recent article in RNA by Garmire and Subramaniam that set out to compare various normalization strategies specifically for this application (Garmire and Subramaniam 2012). They compared methods currently in use for normalization of messenger RNA sequencing (mRNA-Seq) data, such as total-depth normalization (“raw”) and Trimmed Mean of M-values (“TMM”). Additionally, they compared many methods not used previously with sequencing data, such as global scaling, and borrowed from strategies applied to microarray studies, such as quantile normalization (QN). The article attracted our attention for many reasons, but notably for the claimed poor performance and “abnormal results” of our TMM method (Robinson and Oshlack 2010). After investigating, we discovered that TMM’s claimed poor performance was the result of an error that shifted log-ratios in the wrong direction. Furthermore, we felt that various practical issues were not satisfyingly discussed; we comment briefly on these here and provide reproducible reanalyses to support our claims (see Supplemental Material).