Ten Years of Pathway Analysis: Current Approaches and Outstanding Challenges
Pathway analysis has become the first choice for gaining insight into the underlying biology of differentially expressed genes and proteins, as it reduces complexity and has increased explanatory power. We discuss the evolution of knowledge base-driven pathway analysis over its first decade, distinctly divided into three generations. We also discuss the limitations that are specific to each generation, and how they are addressed by successive generations of methods. We identify a number of annotation challenges that must be addressed to enable development of the next generation of pathway analysis methods. Furthermore, we identify a number of methodological challenges that the next generation of methods must tackle to take advantage of the technological advances in genomics and proteomics in order to improve specificity, sensitivity, and relevance of pathway analysis.
Systems biological assessment of immunity to mild versus severe COVID-19 infection in humans
Coronavirus disease 2019 (COVID-19) represents a global crisis, yet major knowledge gaps remain about human immunity to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). We analyzed immune responses in 76 COVID-19 patients and 69 healthy individuals from Hong Kong and Atlanta, Georgia, United States. In the peripheral blood mononuclear cells (PBMCs) of COVID-19 patients, we observed reduced expression of human leukocyte antigen class DR (HLA-DR) and proinflammatory cytokines by myeloid cells as well as impaired mammalian target of rapamycin (mTOR) signaling and interferon-α (IFN-α) production by plasmacytoid dendritic cells. By contrast, we detected enhanced plasma levels of inflammatory mediators, including EN-RAGE, TNFSF14, and oncostatin M, which correlated with disease severity and increased bacterial products in plasma. Single-cell transcriptomics revealed a lack of type I IFNs, reduced HLA-DR in the myeloid cells of patients with severe COVID-19, and transient expression of IFN-stimulated genes. This was consistent with bulk PBMC transcriptomics and transient, low IFN-α levels in plasma during infection. These results reveal mechanisms and potential therapeutic targets for COVID-19.
A systems biology approach for pathway level analysis
A common challenge in the analysis of genomics data is trying to understand the underlying phenomenon in the context of all complex interactions taking place on various signaling pathways. A statistical approach using various models is universally used to identify the most relevant pathways in a given experiment. Here, we show that the existing pathway analysis methods fail to take into consideration important biological aspects and may provide incorrect results in certain situations. By using a systems biology approach, we developed an impact analysis that includes the classical statistics but also considers other crucial factors such as the magnitude of each gene's expression change, their type and position in the given pathways, their interactions, etc. The impact analysis is an attempt at a deeper level of statistical analysis, informed by more pathway-specific biology than the existing techniques. On several illustrative data sets, the classical analysis produces both false positives and false negatives, while the impact analysis provides biologically meaningful results. This analysis method has been implemented as a Web-based tool, Pathway-Express, freely available as part of the Onto-Tools (http://vortex.cs.wayne.edu).
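To make the "classical statistics" mentioned in this abstract concrete, the following is a minimal sketch of an over-representation test, assuming a hypergeometric model of drawing differentially expressed (DE) genes without replacement. The abstract does not name the exact model used by Pathway-Express, so the model choice, the function name, and all parameter names here are illustrative assumptions.

```python
from math import comb


def enrichment_pvalue(n_total, n_de, pathway_size, de_in_pathway):
    """Upper-tail hypergeometric p-value for pathway over-representation.

    Assumed model (not taken from the abstract): n_de DE genes are drawn
    without replacement from n_total measured genes, of which pathway_size
    belong to the pathway. Returns P(X >= de_in_pathway).
    """
    if not (0 <= pathway_size <= n_total and 0 <= n_de <= n_total):
        raise ValueError("pathway_size and n_de must lie in [0, n_total]")
    upper = min(n_de, pathway_size)
    denom = comb(n_total, n_de)
    # Sum the probabilities of seeing k = de_in_pathway, ..., upper DE genes.
    tail = sum(
        comb(pathway_size, k) * comb(n_total - pathway_size, n_de - k)
        for k in range(max(de_in_pathway, 0), upper + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Hypothetical numbers: 10 genes measured, 3 DE, pathway of 4 genes,
    # all 3 DE genes fall in the pathway.
    print(enrichment_pvalue(10, 3, 4, 3))
```

The function only sees counts, so two pathways with the same counts get the same p-value regardless of which genes changed, by how much, or where they sit in the pathway. That is the limitation the impact analysis is described as addressing.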
A novel signaling pathway impact analysis
MOTIVATION: Gene expression class comparison studies may identify hundreds or thousands of genes as differentially expressed (DE) between sample groups. Gaining biological insight from the result of such experiments can be approached, for instance, by identifying the signaling pathways impacted by the observed changes. Most of the existing pathway analysis methods focus on either the number of DE genes observed in a given pathway (enrichment analysis methods), or on the correlation between the pathway genes and the class of the samples (functional class scoring methods). Both approaches treat the pathways as simple sets of genes, disregarding the complex gene interactions that these pathways are built to describe.
RESULTS: We describe a novel signaling pathway impact analysis (SPIA) that combines the evidence obtained from the classical enrichment analysis with a novel type of evidence, which measures the actual perturbation on a given pathway under a given condition. A bootstrap procedure is used to assess the significance of the observed total pathway perturbation. Using simulations we show that the evidence derived from perturbations is independent of the pathway enrichment evidence. This allows us to calculate a global pathway significance P-value, which combines the enrichment and perturbation P-values. We illustrate the capabilities of the novel method on four real datasets. The results obtained on these data show that SPIA has better specificity and more sensitivity than several widely used pathway analysis methods.
AVAILABILITY: SPIA was implemented as an R package available at http://vortex.cs.wayne.edu/ontoexpress/
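The abstract says that the enrichment and perturbation P-values are independent and can be combined into one global P-value, but it does not give the formula. One standard option, assumed here for illustration only, uses the product of the two values: if both are independent and uniform on [0, 1] under the null hypothesis, the probability that their product is at most c equals c - c ln(c). The function and parameter names below are hypothetical.

```python
from math import log


def combine_independent_pvalues(p_enrich, p_perturb):
    """Combine two independent p-values through their product.

    Assumption (not stated in the abstract): under the null, both inputs
    are independent Uniform(0, 1), so P(U1 * U2 <= c) = c - c * ln(c).
    """
    for p in (p_enrich, p_perturb):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    c = p_enrich * p_perturb
    if c == 0.0:
        # The limit of c - c * ln(c) as c approaches 0 is 0.
        return 0.0
    return c - c * log(c)


if __name__ == "__main__":
    # Hypothetical inputs: each line of evidence is borderline on its own.
    print(combine_independent_pvalues(0.05, 0.05))
```

Under this rule, two borderline P-values of 0.05 combine to a smaller global value, while a pathway with no evidence of either kind (both values equal to 1) keeps a global value of 1.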
Ontological analysis of gene expression data: current tools, limitations, and open problems
Independent of the platform and the analysis methods used, the result of a microarray experiment is, in most cases, a list of differentially expressed genes. An automatic ontological analysis approach has been recently proposed to help with the biological interpretation of such results. Currently, this approach is the de facto standard for the secondary analysis of high throughput experiments and a large number of tools have been developed for this purpose. We present a detailed comparison of 14 such tools using the following criteria: scope of the analysis, visualization capabilities, statistical model(s) used, correction for multiple comparisons, reference microarrays available, installation issues and sources of annotation data. This detailed analysis of the capabilities of these tools will help researchers choose the most appropriate tool for a given type of analysis. More importantly, in spite of the fact that this type of analysis has been generally adopted, this approach has several important intrinsic drawbacks. These drawbacks are associated with all tools discussed and represent conceptual limitations of the current state-of-the-art in ontological analysis. We propose these as challenges for the next generation of secondary data analysis tools.