Publishes on Minerals Flotation and Separation Techniques, Metallurgical Processes and Thermodynamics, Biofuel production and bioconversion. 101 papers and 29.4k citations.
A tremendous amount of RNA sequencing data has been produced by large consortium projects such as TCGA and GTEx, creating new opportunities for data mining and a deeper understanding of gene functions. While certain existing web servers are valuable and widely used, many expression analysis functions needed by experimental biologists are still not adequately addressed by these tools. We introduce GEPIA (Gene Expression Profiling Interactive Analysis), a web-based tool that delivers fast and customizable functionalities based on TCGA and GTEx data. GEPIA provides key interactive and customizable functions including differential expression analysis, profiling plotting, correlation analysis, patient survival analysis, similar gene detection and dimensionality reduction analysis. These comprehensive expression analyses, available through simple clicks in GEPIA, greatly facilitate data mining in wide research areas, scientific discussion and the therapeutic discovery process. GEPIA fills the gap between cancer genomics big data and the delivery of integrated information to end users, thus helping unleash the value of current data resources. GEPIA is available at http://gepia.cancer-pku.cn/.
Introduced in 2017, the GEPIA (Gene Expression Profiling Interactive Analysis) web server has been a valuable and highly cited resource for gene expression analysis based on tumor and normal samples from the TCGA and GTEx databases. Here, we present GEPIA2, an updated and enhanced version that provides insights at higher resolution and with more functionalities. Featuring 198 619 isoforms and 84 cancer subtypes, GEPIA2 extends gene expression quantification from the gene level to the transcript level, and supports both analysis of a specific cancer subtype and comparison between subtypes. In addition, GEPIA2 adopts new gene signature quantification techniques inspired by single-cell sequencing studies, and provides customized analysis in which users can upload their own RNA-seq data and compare them with TCGA and GTEx samples. We also offer an API for batch processing and easy retrieval of the analysis results. The updated web server is publicly accessible at http://gepia2.cancer-pku.cn/.
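To make the idea of gene signature quantification concrete, the following is a minimal sketch that scores one sample as the mean of log2(TPM + 1) over the genes in a signature. The function name `signature_score`, the example gene set and the TPM values are all illustrative assumptions; this is one common way to summarize a signature and is not taken from the GEPIA2 implementation or its API.

```python
import math


def signature_score(tpm_by_gene, signature):
    """Mean log2(TPM + 1) over the signature genes found in the sample.

    Hypothetical helper for illustration; genes missing from the sample
    are skipped rather than treated as zero.
    """
    values = [
        math.log2(tpm_by_gene[gene] + 1.0)
        for gene in signature
        if gene in tpm_by_gene
    ]
    if not values:
        raise ValueError("no signature gene found in sample")
    return sum(values) / len(values)


# Made-up TPM values for a single sample.
sample = {"CD8A": 3.0, "GZMB": 7.0, "PDCD1": 0.0}
score = signature_score(sample, ["CD8A", "GZMB", "PDCD1"])
print(round(score, 4))
```

Scoring each sample this way gives a single number per sample, which can then be compared between tumor and normal groups or correlated with another gene or signature.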
In 2017, we released the GEPIA (Gene Expression Profiling Interactive Analysis) web server to facilitate widely used analyses based on the bulk gene expression datasets of the TCGA and GTEx projects, providing biologists and clinicians with a handy tool for comprehensive and complex data mining tasks. Recently, deconvolution tools have made it possible to resolve bulk RNA datasets at cell type-level resolution, and interrogating the characteristics of different cell types in cancer and control cohorts has become an important strategy for investigating biological questions. We therefore present GEPIA2021, a standalone extension of GEPIA that allows users to perform multiple interactive analyses based on the deconvolution results, including cell type-level proportion comparison, correlation analysis, differential expression, and survival analysis. With GEPIA2021, experimental biologists can easily explore the large TCGA and GTEx datasets and validate their hypotheses at an enhanced resolution. GEPIA2021 is publicly accessible at http://gepia2021.cancer-pku.cn/.
Single-cell RNA sequencing (scRNA-seq) is a versatile tool for discovering and annotating cell types and states, but the determination and annotation of cell subtypes are often subjective and arbitrary. Often, it is not even clear whether a given cluster is uniform. Here we present an entropy-based statistic, ROGUE, to accurately quantify the purity of identified cell clusters. We demonstrate that our ROGUE metric is broadly applicable, and enables accurate, sensitive and robust assessment of cluster purity on a wide range of simulated and real datasets. Applying this metric to fibroblast, B cell and brain data, we identify additional subtypes and demonstrate the application of ROGUE-guided analyses to detect precise signals in specific subpopulations. ROGUE can be applied to all tested scRNA-seq datasets, and has important implications for evaluating the quality of putative clusters, discovering pure cell subtypes and constructing comprehensive, detailed and standardized single-cell atlases.
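As a rough intuition for why a log-based statistic can flag an impure cluster, the sketch below computes, for one gene, the gap between the log of the mean expression and the mean of the log expression across the cells of a cluster. This gap is zero when every cell expresses the gene equally and grows as expression becomes more heterogeneous. It is a simplified illustration only and is not the ROGUE statistic; the function name `heterogeneity_gap` and the count values are assumptions made for the example.

```python
import math


def heterogeneity_gap(counts):
    """log(mean(x) + 1) minus mean(log(x + 1)) for one gene in one cluster.

    By concavity of the logarithm this value is never negative, and it is
    zero only when all cells have the same expression value.
    Illustrative helper; not the ROGUE formula.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    mean_expr = sum(counts) / len(counts)
    mean_log = sum(math.log(c + 1.0) for c in counts) / len(counts)
    return math.log(mean_expr + 1.0) - mean_log


# Made-up expression counts for one gene across four cells.
uniform_cluster = [5, 5, 5, 5]   # every cell alike
mixed_cluster = [0, 0, 10, 10]   # two hidden subpopulations

print(heterogeneity_gap(uniform_cluster))
print(heterogeneity_gap(mixed_cluster))
```

Both clusters have the same mean expression, yet only the mixed one shows a clearly positive gap, which is the kind of signal a purity metric aggregates over many genes.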