Spatially informed clustering, integration, and deconvolution of spatial transcriptomics with GraphST
Yahui Long, Kok Siong Ang, Mengwei Li et al.|Nature Communications|2023
Spatial transcriptomics technologies generate gene expression profiles with spatial context, requiring spatially informed analysis tools for three key tasks: spatial clustering, multisample integration, and cell-type deconvolution. We present GraphST, a graph self-supervised contrastive learning method that fully exploits spatial transcriptomics data to outperform existing methods. It combines graph neural networks with self-supervised contrastive learning to learn informative and discriminative spot representations by minimizing the embedding distance between spatially adjacent spots and maximizing it between non-adjacent spots. We demonstrated GraphST on multiple tissue types and technology platforms. GraphST achieved 10% higher clustering accuracy and better delineated fine-grained tissue structures in brain and embryo tissues. GraphST is also the only method that can jointly analyze multiple tissue slices in vertical or horizontal integration while correcting batch effects. Lastly, GraphST demonstrated superior cell-type deconvolution to capture spatial niches like lymph node germinal centers and exhausted tumor infiltrating T cells in breast tumor tissue.
Unsupervised spatially embedded deep representation of spatial transcriptomics
Hang Xu, Huazhu Fu, Yahui Long et al.|Genome Medicine|2024
Optimal integration of transcriptomics data and associated spatial information is essential to fully exploiting spatial transcriptomics to dissect tissue heterogeneity and map out inter-cellular communications. We present SEDR, which uses a deep autoencoder coupled with a masked self-supervised learning mechanism to construct a low-dimensional latent representation of gene expression, which is then simultaneously embedded with the corresponding spatial information through a variational graph autoencoder. SEDR achieved higher clustering performance on manually annotated 10x Visium datasets and better scalability on high-resolution spatial transcriptomics datasets than existing methods. Additionally, we show SEDR's ability to impute and denoise gene expression (URL: https://github.com/JinmiaoChenLab/SEDR/).
Predicting human microbe–drug associations via graph convolutional network with conditional random field
Yahui Long, Min Wu, Chee Keong Kwoh et al.|Bioinformatics|2020
MOTIVATION: Human microbes play critical roles in drug development and precision medicine. Systematically understanding the complex interaction mechanisms between human microbes and drugs remains a challenge. Identifying microbe-drug associations can not only provide insight into these mechanisms, but also boost drug discovery and repurposing. Given the high cost and risk of biological experiments, computational approaches are an attractive alternative. However, few computational approaches have so far been developed for this task.
RESULTS: In this work, we leveraged rich biological information to construct a heterogeneous network for drugs and microbes, including a microbe similarity network, a drug similarity network and a microbe-drug interaction network. We then proposed a novel graph convolutional network (GCN)-based framework for predicting human Microbe-Drug Associations, named GCNMDA. In the hidden layer of the GCN, we further exploited the Conditional Random Field (CRF), which can ensure that similar nodes (i.e. microbes or drugs) have similar representations. To aggregate neighborhood representations more accurately, an attention mechanism was designed in the CRF layer. Moreover, we performed a random walk with restart-based scheme on both the drug and microbe similarity networks to learn valuable features for drugs and microbes, respectively. Experimental results on three different datasets showed that our GCNMDA model consistently achieved better performance than seven state-of-the-art methods. Case studies for three microbes including SARS-CoV-2 and two antimicrobial drugs (i.e. Ciprofloxacin and Moxifloxacin) further confirmed the effectiveness of GCNMDA in identifying potential microbe-drug associations.
AVAILABILITY AND IMPLEMENTATION: Python code and datasets are available at: https://github.com/longyahui/GCNMDA.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
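The GCNMDA abstract above mentions a random walk with restart (RWR) scheme on the drug and microbe similarity networks. As a rough illustration only, and not the authors' implementation, a generic RWR over a small similarity matrix might look like the sketch below. The function name, the restart probability of 0.5, the convergence settings, and the toy matrix are all assumptions made for this example.

```python
import numpy as np


def random_walk_with_restart(similarity, seed_index, restart_prob=0.5,
                             tol=1e-10, max_iter=1000):
    """Generic RWR sketch (hypothetical helper, not taken from GCNMDA).

    Iterates p <- (1 - r) * W @ p + r * p0 until the L1 change is below tol,
    where W is the column-normalized similarity matrix and p0 is a one-hot
    vector on the seed node.
    """
    sim = np.asarray(similarity, dtype=float)
    col_sums = sim.sum(axis=0, keepdims=True)
    # Avoid division by zero; an all-zero column stays all-zero (mass leaks).
    col_sums[col_sums == 0] = 1.0
    transition = sim / col_sums  # each non-empty column sums to 1

    p0 = np.zeros(sim.shape[0])
    p0[seed_index] = 1.0
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart_prob) * (transition @ p) + restart_prob * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Toy symmetric similarity network over three nodes (illustrative values).
toy_similarity = np.array([
    [0.0, 0.8, 0.2],
    [0.8, 0.0, 0.5],
    [0.2, 0.5, 0.0],
])
toy_profile = random_walk_with_restart(toy_similarity, seed_index=0)
```

Each entry of the returned vector can be read as the steady-state probability of finding the walker at that node when it starts from, and keeps restarting at, the seed node. Stacking one such vector per node gives a feature matrix of the kind a downstream model could consume.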
NTSHMDA: Prediction of Human Microbe-Disease Association Based on Random Walk by Integrating Network Topological Similarity
Jiawei Luo, Yahui Long|IEEE/ACM Transactions on Computational Biology and Bioinformatics|2018
Accumulating clinical evidence has demonstrated that the microbes residing in human bodies play an important role in the formation, development, and progression of various complex human diseases. Identifying latent disease-related microbes could provide insight into human disease mechanisms and promote disease prevention, diagnosis, and treatment. In this paper, we first construct a heterogeneous network by connecting the disease similarity network and the microbe similarity network through the known microbe-disease association network, and then develop a novel computational model to predict human microbe-disease associations based on random walk by integrating network topological similarity (NTSHMDA). Specifically, each microbe-disease association pair is regarded as a distinct relationship level and is thus assigned a different weight based on network topological similarity. The experimental results show that NTSHMDA outperforms some state-of-the-art methods, with average AUCs of 0.9070 and 0.8896 ± 0.0038 under leave-one-out cross-validation and 5-fold cross-validation, respectively. In case studies, 9, 18, and 38 of the top 10, 20, and 50 candidate microbes for asthma, and 9, 18, and 45 of those for inflammatory bowel disease, are verified by recently published literature. In conclusion, NTSHMDA has the potential to identify novel disease-microbe associations and can also provide valuable information for drug discovery and biological research.
Deciphering spatial domains from spatial multi-omics with SpatialGlue
Advances in spatial omics technologies now allow multiple types of data to be acquired from the same tissue slice. To realize the full potential of such data, we need spatially informed methods for data integration. Here, we introduce SpatialGlue, a graph neural network model with a dual-attention mechanism that deciphers spatial domains by intra-omics integration of spatial location and omics measurement followed by cross-omics integration. We demonstrated SpatialGlue on data acquired from different tissue types using different technologies, including spatial epigenome-transcriptome and transcriptome-proteome modalities. Compared to other methods, SpatialGlue captured more anatomical details and more accurately resolved spatial domains such as the cortex layers of the brain. Our method also identified cell types like spleen macrophage subsets located at three different zones that were not available in the original data annotations. SpatialGlue scales well with data size and can be used to integrate three modalities. Our spatial multi-omics analysis tool combines the information from complementary omics modalities to obtain a holistic view of cellular and tissue properties.