CellMarker: a manually curated resource of cell markers in human and mouse
Xinxin Zhang, Yujia Lan, Jinyuan Xu et al.|Nucleic Acids Research|2018
One of the most fundamental questions in biology is what types of cells form different tissues and organs in a functionally coordinated fashion. Large-scale single-cell sequencing and experimental biology studies are now rapidly opening up new ways to address this question by revealing substantial numbers of cell markers for distinguishing different cell types in tissues. Here, we developed the CellMarker database (http://biocc.hrbmu.edu.cn/CellMarker/ or http://bio-bigdata.hrbmu.edu.cn/CellMarker/), aiming to provide a comprehensive and accurate resource of cell markers for various cell types in tissues of human and mouse. By manually curating over 100 000 published papers, 4124 entries, including the cell marker information, tissue type, cell type, cancer information and source, were recorded. In total, 13 605 cell markers of 467 cell types in 158 human tissues/sub-tissues and 9148 cell markers of 389 cell types in 81 mouse tissues/sub-tissues were collected and deposited in CellMarker. CellMarker provides a user-friendly interface for browsing, searching and downloading markers of diverse cell types of different tissues. Furthermore, a summary of marker prevalence in each cell type is presented graphically and intuitively through a statistical graph. We believe that CellMarker is a comprehensive and valuable resource for cell research in precisely identifying and characterizing cells, especially at the single-cell level.
A comprehensive overview of lncRNA annotation resources
Jinyuan Xu, Jing Bai, Xinxin Zhang et al.|Briefings in Bioinformatics|2016
Long noncoding RNAs (lncRNAs) are emerging as a class of important regulators participating in various biological functions and disease processes. With the widespread application of next-generation sequencing technologies, large numbers of lncRNAs have been identified, producing plenty of lncRNA annotation resources in different contexts. However, at present, we lack a comprehensive overview of these lncRNA annotation resources. In this study, we reviewed 24 currently available lncRNA annotation resources referring to > 205 000 lncRNAs in over 50 tissues and cell lines. We characterized these annotation resources from different aspects, including exon structure, expression, histone modification and function. We found many distinct properties among these annotation resources. Especially, these resources showed diverse chromatin signatures, remarkable tissue and cell type dependence and functional specificity. Our results suggested the incompleteness and complementarity of current lncRNA annotations and the necessity of integration of multiple resources to comprehensively characterize lncRNAs. Finally, we developed 'LNCat' (lncRNA atlas, freely available at http://biocc.hrbmu.edu.cn/LNCat/), a user-friendly database that provides a genome browser of lncRNA structures, visualization of different resources from multiple angles and download of different combinations of lncRNA annotations, and supports rapid exploration, comparison and integration of lncRNA annotation resources. Overall, our study provides a comprehensive comparison of numerous lncRNA annotations, and can facilitate understanding of lncRNAs in human disease.
Breast cancer prognosis signature: linking risk stratification to disease subtypes
Fulong Yu, Fei Quan, Jinyuan Xu et al.|Briefings in Bioinformatics|2018
Breast cancer is a very complex and heterogeneous disease with variable molecular mechanisms of carcinogenesis and clinical behaviors. The identification of prognostic risk factors may enable effective diagnosis and treatment of breast cancer. In particular, numerous gene-expression-based prognostic signatures have been developed, and some of them have already been applied in clinical trials and practice. In this study, we summarized several representative gene-expression-based signatures with significant prognostic value and separately assessed their prognostic performance in their originally targeted populations of breast cancer. Notably, many of the collected signatures were originally designed to predict the outcomes of estrogen receptor positive (ER+) patients or the whole breast cancer cohort; there are no typical signatures for prognostic prediction in a patient population defined by a specific intrinsic subtype. We thus attempted to identify subtype-specific prognostic signatures via a computational framework for analyzing multi-omics profiles and patient survival. For both the discovery and an independent data set, we confirmed that the subtype-specific signature is a strong and significant independent prognostic factor in the corresponding cohort. These results indicate that the subtype-specific prognostic signature has a much higher resolution in risk stratification, which may lead to improved therapies and precision medicine for patients with breast cancer.
Identification of cancer-related lncRNAs through integrating genome, regulome and transcriptome features
Tingting Zhao, Jinyuan Xu, Ling Liu et al.|Molecular BioSystems|2014
LncRNAs have become rising stars in biology and medicine, due to their versatile functions in a wide range of important biological processes and active roles in various human cancers. Here, we developed a computational method based on the naïve Bayesian classifier method to identify cancer-related lncRNAs by integrating genome, regulome and transcriptome data, and identified 707 potential cancer-related lncRNAs. We demonstrated the performance of the method by ten-fold cross-validation, and found that integration of multi-omic data was necessary to identify cancer-related lncRNAs. We identified 707 potential cancer-related lncRNAs and our results showed that these lncRNAs tend to exhibit significant differential expression and differential DNA methylation in multiple cancer types, and prognosis effects in prostate cancer. We also found that these lncRNAs were more likely to be direct targets of TP53 family members than others. Moreover, based on 147 lncRNA knockdown data in mice, we validated that four of six mouse orthologous lncRNAs were significantly involved in many cancer-related processes, such as cell differentiation and the Wnt signaling pathway. Notably, one lncRNA, lnc-SNURF-1, which was found to be associated with TNF-mediated signaling pathways, was up-regulated in prostate cancer and the protein-coding genes affected by knockdown of the lncRNA were also significantly aberrant in prostate cancer patients, suggesting its probable importance in tumorigenesis. Taken together, our method underlines the power of integrating multi-omic data to uncover cancer-related lncRNAs.
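The abstract above names a naïve Bayesian classifier evaluated by ten-fold cross-validation but gives no implementation details. The following is only a minimal sketch of that general technique, assuming a Gaussian naive Bayes model over three synthetic numeric features. The features, the data, and the function names (`fit_gaussian_nb`, `predict`, `cross_validate`) are hypothetical illustrations and are not taken from the paper.

```python
import math
import random


def fit_gaussian_nb(X, y):
    """Estimate a log-prior plus per-feature mean and variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small variance floor keeps the log-density finite.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(y)), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior, treating features as independent."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(X, y, k=10, seed=0):
    """Shuffle the indices, split them into k folds, and return one accuracy per fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accuracies = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in fold)
        accuracies.append(correct / len(fold))
    return accuracies


# Synthetic stand-in data (hypothetical): two classes, three numeric features
# loosely standing in for one genome, one regulome and one transcriptome feature.
rng = random.Random(42)
X, y = [], []
for label, centre in ((0, 0.0), (1, 3.0)):
    for _ in range(100):
        X.append([rng.gauss(centre, 1.0) for _ in range(3)])
        y.append(label)

accuracies = cross_validate(X, y, k=10)
mean_accuracy = sum(accuracies) / len(accuracies)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

Each fold is held out exactly once while the model is fitted on the remaining nine, so every sample contributes to one accuracy estimate. Real multi-omic features would likely need their own preprocessing and possibly a different likelihood model than the Gaussian assumed here.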
Sex difference of mutation clonality in diffuse glioma evolution
BACKGROUND: Sex differences in glioma incidence and outcome have been previously reported but remain poorly understood. Many sex differences that affect the cancer risk were thought to be associated with cancer evolution. METHODS: In this study, we used an integrated framework to infer the timing and clonal status of mutations in ~600 diffuse gliomas from The Cancer Genome Atlas (TCGA) including glioblastomas (GBMs) and low-grade gliomas (LGGs), and investigated the sex difference of mutation clonality. RESULTS: We observed higher overall and subclonal mutation burden in female patients with different grades of gliomas, which could be largely explained by the mutations of the X chromosome. Some well-established drivers were identified showing sex-biased clonality, such as CDH18 and ATRX. Focusing on glioma subtypes, we further found a higher subclonal mutation burden in females than males in the majority of glioma subtypes, and observed opposite clonal tendency of several drivers between male and female patients in a specific subtype. Moreover, analysis of clinically actionable genes revealed that mutations in genes of the mitogen-activated protein kinase (MAPK) signaling pathway were more likely to be clonal in female patients with GBM, whereas mutations in genes involved in the receptor tyrosine kinase signaling pathway were more likely to be clonal in male patients with LGG. CONCLUSIONS: The patients with diffuse glioma showed sex-biased mutation clonality (eg, different subclonal mutation number and different clonal tendency of cancer genes), highlighting the need to consider sex as an important variable for improving glioma therapy and clinical care.