Henry Heberle

InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams

Henry Heberle, Gabriela Vaz Meirelles, Felipe Rodrigues da Silva et al.|BMC Bioinformatics|2015

Cited by 2.6k|Open Access

BACKGROUND: Set comparisons permeate a large number of data analysis workflows, in particular workflows in biological sciences. Venn diagrams are frequently employed for such analysis but current tools are limited. RESULTS: We have developed InteractiVenn, a more flexible tool for interacting with Venn diagrams including up to six sets. It offers a clean interface for Venn diagram construction and enables analysis of set unions while preserving the shape of the diagram. Set unions are useful to reveal differences and similarities among sets and may be guided in our tool by a tree or by a list of set unions. The tool also allows obtaining subsets' elements, saving and loading sets for further analyses, and exporting the diagram in vector and image formats. InteractiVenn has been used to analyze two biological datasets, but it may serve set analysis in a broad range of domains. CONCLUSIONS: InteractiVenn allows set unions in Venn diagrams to be explored thoroughly, consequently extending the ability to analyze combinations of sets with additional observations yielded by novel interactions between joined sets. InteractiVenn is freely available online at www.interactivenn.net.
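The idea of exploring set unions in a Venn diagram can be illustrated with a minimal Python sketch. This is not InteractiVenn's implementation; the function names `venn_regions` and `merge_sets`, the set labels, and the element names are all hypothetical and chosen only for illustration. The sketch computes the exclusive regions of a diagram (the elements found in exactly one combination of sets), then merges two sets into their union and recomputes the regions.

```python
def venn_regions(sets):
    """Map each combination of set names to the elements found in exactly those sets."""
    regions = {}
    universe = set().union(*sets.values())
    for element in universe:
        # The region of an element is the group of all sets that contain it.
        members = frozenset(name for name, s in sets.items() if element in s)
        regions.setdefault(members, set()).add(element)
    return regions


def merge_sets(sets, groups):
    """Build new sets where each label holds the union of the named input sets."""
    return {
        label: set().union(*(sets[name] for name in names))
        for label, names in groups.items()
    }


# Hypothetical input: three small sets of identifiers.
sets = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g3", "g5"},
}

regions = venn_regions(sets)

# Treat A and B as one set (their union) and compare the result against C.
merged = merge_sets(sets, {"A+B": ["A", "B"], "C": ["C"]})
merged_regions = venn_regions(merged)
```

With this toy input, `regions[frozenset({"A", "B"})]` holds the elements shared by A and B but absent from C, while `merged_regions[frozenset({"A+B"})]` holds everything in A or B that is not in C.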

Combining discovery and targeted proteomics reveals a prognostic signature in oral cancer

Carolina Moretto Carnielli, Carolina Carneiro Soares Macedo, Tatiane De Rossi et al.|Nature Communications|2018

Cited by 197|Open Access

Different regions of oral squamous cell carcinoma (OSCC) have particular histopathological and molecular characteristics limiting the standard tumor-node-metastasis prognosis classification. Therefore, defining biological signatures that allow assessing the prognostic outcomes for OSCC patients would be of great clinical significance. Using histopathology-guided discovery proteomics, we analyze neoplastic islands and stroma from the invasive tumor front (ITF) and inner tumor to identify differentially expressed proteins. Potential signature proteins are prioritized and further investigated by immunohistochemistry (IHC) and targeted proteomics. IHC indicates low expression of cystatin-B in neoplastic islands from the ITF as an independent marker for local recurrence. Targeted proteomics analysis of the prioritized proteins in saliva, combined with machine-learning methods, highlights a peptide-based signature as the most powerful predictor to distinguish patients with and without lymph node metastasis. In summary, we identify a robust signature, which may enhance prognostic decisions in OSCC and better guide treatment to reduce tumor recurrence or lymph node metastasis.

ChemInformatics Model Explorer (CIME): exploratory analysis of chemical model explanations

Christina Humer, Henry Heberle, Floriane Montanari et al.|Journal of Cheminformatics|2022

Cited by 38|Open Access

The introduction of machine learning to small molecule research, an inherently multidisciplinary field in which chemists and data scientists combine their expertise and collaborate, has been vital to making screening processes more efficient. In recent years, numerous models that predict pharmacokinetic properties or bioactivity have been published, and these are used on a daily basis by chemists to make decisions and prioritize ideas. The emerging field of explainable artificial intelligence is opening up new possibilities for understanding the reasoning that underlies a model. In small molecule research, this means relating contributions of substructures of compounds to their predicted properties, which in turn also allows the areas of the compounds that have the greatest influence on the outcome to be identified. However, there is no interactive visualization tool that facilitates such interdisciplinary collaborations towards interpretability of machine learning models for small molecules. To fill this gap, we present CIME (ChemInformatics Model Explorer), an interactive web-based system that allows users to inspect chemical data sets, visualize model explanations, compare interpretability techniques, and explore subgroups of compounds. The tool is model-agnostic and can be run on a server or a workstation.

XSMILES: interactive visualization for molecules, SMILES and XAI attribution scores

Henry Heberle, Linlin Zhao, Sebastian Schmidt et al.|Journal of Cheminformatics|2023

Cited by 32|Open Access

BACKGROUND: Explainable artificial intelligence (XAI) methods have shown increasing applicability in chemistry. In this context, visualization techniques can highlight regions of a molecule to reveal their influence over a predicted property. For this purpose, some XAI techniques calculate attribution scores associated with tokens of SMILES strings or with atoms of a molecule. While an association of a score with an atom can be directly visually represented on a molecule diagram, scores computed for SMILES non-atom tokens cannot. For instance, a substring [N+] contains 3 non-atom tokens, i.e., [, +, and ], and their attributions, depending on the model, do not necessarily reveal an influence of the nitrogen atom over the predicted property; for that reason, it is not possible to represent the scores on a molecule diagram. Moreover, SMILES's notation is complex, foregrounding the need for techniques to facilitate the analysis of explanations associated with their tokens. RESULTS: We propose XSMILES, an interactive visualization technique, to explore explainable artificial intelligence attribution scores and support the interpretation of SMILES. Users can input any type of score attributed to atom and non-atom tokens and visualize them on top of a 2D molecule diagram coordinated with a bar chart that represents a SMILES string. We demonstrate how attributions calculated for SMILES strings can be evaluated and better interpreted through interactivity with two use cases. CONCLUSIONS: Data scientists can use XSMILES to understand their models' behavior and compare multiple modeling approaches. The tool provides a set of parameters to adapt the visualization to users' needs and it can be integrated into different platforms. We believe XSMILES can help data scientists develop, improve, and communicate their models by making it easier to identify patterns and compare attributions through interactive exploratory visualization.
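The distinction between atom and non-atom tokens can be made concrete with a short Python sketch. It is not the XSMILES tokenizer: the pattern below is a deliberate simplification that recognizes only Cl and Br as two-letter symbols and treats every other character as its own token, and the names `tokenize`, `is_atom_token`, and `split_scores` as well as the score values are hypothetical. The sketch separates per-token attribution scores into those that can be drawn on atoms of a molecule diagram and those that belong to non-atom tokens.

```python
import re

# Simplified token pattern (an assumption): two-letter halogens first,
# then any single character. Real SMILES tokenizers handle more cases.
TOKEN_PATTERN = re.compile(r"Cl|Br|.")


def tokenize(smiles):
    """Split a SMILES string into tokens using the simplified pattern."""
    return TOKEN_PATTERN.findall(smiles)


def is_atom_token(token):
    """Treat purely alphabetic tokens (C, N, c, Cl, ...) as atoms."""
    return token.isalpha()


def split_scores(smiles, scores):
    """Pair tokens with scores and separate atom from non-atom tokens."""
    tokens = tokenize(smiles)
    if len(tokens) != len(scores):
        raise ValueError("expected one score per token")
    atom, non_atom = [], []
    for token, score in zip(tokens, scores):
        (atom if is_atom_token(token) else non_atom).append((token, score))
    return atom, non_atom


# Hypothetical attribution scores, one per token of "[N+]".
atom_scores, non_atom_scores = split_scores("[N+]", [0.1, 0.6, -0.2, 0.05])
```

Here `non_atom_scores` contains the three entries for `[`, `+`, and `]`, which have no atom of their own to be drawn on, while `atom_scores` contains the single entry for `N`.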

Connecting multiple microenvironment proteomes uncovers the biology in head and neck cancer

Ariane F. Busso‐Lopes, Leandro Xavier Neves, Guilherme A. Câmara et al.|Nature Communications|2022

Cited by 27|Open Access

The poor prognosis of head and neck cancer (HNC) is associated with metastasis within the lymph nodes (LNs). Herein, the proteome of 140 multisite samples from a 59-HNC patient cohort, including primary and matched LN-negative or -positive tissues, saliva, and blood cells, reveals insights into the biology and potential metastasis biomarkers that may assist in clinical decision-making. Protein profiles are strictly associated with immune modulation across datasets, and this provides the basis for investigating immune markers associated with metastasis. The proteome of LN metastatic cells recapitulates the proteome of the primary tumor sites. Conversely, the LN microenvironment proteome highlights the candidate prognostic markers. By integrating prioritized peptide, protein, and transcript levels with machine learning models, we identify nodal metastasis signatures in blood and saliva. We present a proteomic characterization wiring multiple sites in HNC, thus providing a promising basis for understanding tumoral biology and identifying metastasis-associated signatures.


Top publications by citations