<i>K</i><sub>DEEP</sub>: Protein–Ligand Absolute Binding Affinity Prediction via 3D-Convolutional Neural Networks|José Jiménez-Luna, Miha Škalič, Gérard Martinez et al.|Journal of Chemical Information and Modeling|2018 Accurately predicting protein–ligand binding affinities is an important problem in computational chemistry because it can substantially accelerate virtual screening and lead optimization in drug discovery. We propose here a fast machine-learning approach for predicting binding affinities using state-of-the-art 3D-convolutional neural networks and compare it with other machine-learning and scoring methods on several diverse data sets. The results for the standard PDBbind (v.2016) core test set are state-of-the-art, with a Pearson’s correlation coefficient of 0.82 and an RMSE of 1.27 pK units between experimental and predicted affinity, but accuracy is still very sensitive to the specific protein used. <i>K</i><sub>DEEP</sub> is made available via PlayMolecule.org so that users can easily test their own protein–ligand complexes, with each prediction taking a fraction of a second. We believe that the speed, performance, and ease of use of <i>K</i><sub>DEEP</sub> already make it an attractive scoring function for modern computational chemistry pipelines.
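The two figures of merit quoted in this abstract, Pearson's correlation coefficient and RMSE in pK units, can be sketched as follows. This is a minimal illustration only: the function names and the example pK values are made up for the sketch and are not taken from the paper or from PDBbind.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical experimental and predicted affinities in pK units.
experimental_pk = [4.2, 6.1, 7.8, 5.5, 9.0]
predicted_pk = [4.9, 5.8, 7.1, 6.0, 8.2]

print(f"Pearson r = {pearson_r(experimental_pk, predicted_pk):.2f}")
print(f"RMSE      = {rmse(experimental_pk, predicted_pk):.2f} pK units")
```

Reporting both numbers is useful because they capture different things: the correlation measures how well the ranking and trend are reproduced, while the RMSE measures the absolute error of each prediction.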
SUPPA2: fast, accurate, and uncertainty-aware differential splicing analysis across multiple conditions|Despite the many approaches for studying differential splicing from RNA-seq, many challenges remain unsolved, including computing capacity and sequencing depth requirements. Here we present SUPPA2, a new method that addresses these challenges and enables streamlined analysis across multiple conditions while taking biological variability into account. Using experimental and simulated data, we show that SUPPA2 achieves higher accuracy than other methods, especially at low sequencing depth and short read length. We use SUPPA2 to identify novel Transformer2-regulated exons, novel microexons induced during differentiation of bipolar neurons, and novel intron retention events during erythroblast differentiation.
Shape-Based Generative Modeling for de Novo Drug Design|Miha Škalič, José Jiménez-Luna, Davide Sabbadin et al.|Journal of Chemical Information and Modeling|2019 In this work, we propose a machine learning approach to generate novel molecules starting from a seed compound, its three-dimensional (3D) shape, and its pharmacophoric features. The pipeline draws inspiration from generative models used in image analysis and represents a first example of the de novo design of lead-like molecules guided by shape-based features. A variational autoencoder is used to perturb the 3D representation of a compound, followed by a system of convolutional and recurrent neural networks that generate a sequence of SMILES tokens. The generative design of novel scaffolds and functional groups can cover unexplored regions of chemical space that still possess lead-like properties.
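To make "a sequence of SMILES tokens" concrete, here is a minimal sketch of splitting a SMILES string into tokens with a regular expression. The token pattern is an assumption chosen for illustration (bracket atoms, the two-letter halogens Cl and Br, ring-closure labels, single-letter atoms, and bond and branch symbols); the abstract does not describe the vocabulary the authors used.

```python
import re

# Illustrative token pattern, not the paper's vocabulary. Alternatives are
# tried left to right, so bracket atoms and two-letter halogens come first.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"          # bracket atoms such as [C@H] or [NH3+]
    r"|Br|Cl"              # two-letter organic-subset atoms
    r"|%\d{2}"             # two-digit ring-closure labels such as %12
    r"|[A-Za-z]"           # single-letter atoms, aromatic or aliphatic
    r"|\d"                 # single-digit ring-closure labels
    r"|[()=#\-+\\/:.~*@$]"  # branches, bonds, and other symbols
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens; raise if any character is skipped."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize SMILES string: {smiles!r}")
    return tokens


# Aspirin, written as a SMILES string.
print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))
```

A sequence model then emits one such token per step, which is why multi-character units like `Cl` or `[C@H]` are kept whole rather than split into characters.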
From Target to Drug: Generative Modeling for the Multimodal Structure-Based Ligand Design|Miha Škalič, Davide Sabbadin, Boris Sattarov et al.|Molecular Pharmaceutics|2019 Chemical space is impractically large, and conventional structure-based virtual screening techniques cannot simply search the entire space to discover effective bioactive molecules. To address this shortcoming, we propose a generative adversarial network that generates, rather than searches for, diverse three-dimensional ligand shapes complementary to the pocket. Furthermore, we show that the generated molecule shapes can be decoded by a shape-captioning network into a sequence of SMILES, directly enabling structure-based de novo drug design. We evaluate the quality of the method with both structure-based (docking) and ligand-based [quantitative structure-activity relationship (QSAR)] virtual screening methods. For both evaluation approaches, we observed enrichment compared to random sampling from the initial chemical space of ZINC drug-like compounds.
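One simple way to quantify "enrichment compared to random sampling" is the ratio of hit rates between the generated set and a randomly sampled set. The sketch below assumes that definition; the abstract does not state the exact metric or thresholds used, and the function names and hit flags here are hypothetical.

```python
def hit_rate(hit_flags):
    """Fraction of compounds flagged as hits (for example, passing a score cutoff)."""
    if not hit_flags:
        raise ValueError("need at least one compound")
    return sum(1 for flag in hit_flags if flag) / len(hit_flags)


def enrichment(generated_hits, random_hits):
    """Hit rate of the generated set divided by the hit rate of the random set."""
    baseline = hit_rate(random_hits)
    if baseline == 0:
        raise ValueError("random baseline contains no hits; ratio is undefined")
    return hit_rate(generated_hits) / baseline


# Hypothetical hit flags: True means the compound passed the screening cutoff.
generated = [True, True, False, True, False, False, True, False]
random_sample = [False, True, False, False, False, False, False, False]

print(enrichment(generated, random_sample))
```

A value above 1 means the generated compounds pass the screen more often than compounds drawn at random from the same starting library.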
Coloring Molecules with Explainable Artificial Intelligence for Preclinical Relevance Assessment|José Jiménez-Luna, Miha Škalič, Nils Weskamp et al.|Journal of Chemical Information and Modeling|2021 [...] molecule generation. However, these models are considered "black-box" and "hard-to-debug". This study aimed to improve modeling transparency for rational molecular design by applying the integrated gradients explainable artificial intelligence (XAI) approach to graph neural network models. Models were trained to predict plasma protein binding, hERG channel inhibition, passive permeability, and cytochrome P450 inhibition. The proposed methodology highlighted molecular features and structural elements that agree with known pharmacophore motifs, correctly identified property cliffs, and provided insights into unspecific ligand-target interactions. The developed XAI approach is fully open-sourced and can be used by practitioners to train new models on other clinically relevant endpoints.
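The integrated gradients approach named in this abstract attributes a prediction to each input feature by averaging the model's gradient along a straight path from a baseline input to the actual input, then scaling by the input difference. The sketch below applies it to a toy differentiable function with a hand-written gradient; the toy model, its weights, and the function names are assumptions for illustration and stand in for the graph neural networks used in the paper.

```python
def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum.

    grad_fn(point) must return the gradient of the model output with
    respect to each input feature at `point`.
    """
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th path segment
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(point)
        for i, g in enumerate(grad):
            avg_grad[i] += g / steps
    # Scale the path-averaged gradient by the input difference.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


# Toy model (hypothetical): f(x) = sum_i w_i * x_i^2.
WEIGHTS = [0.5, -1.0, 2.0]


def toy_model(x):
    return sum(w * v * v for w, v in zip(WEIGHTS, x))


def toy_grad(x):
    return [2.0 * w * v for w, v in zip(WEIGHTS, x)]


features = [1.0, 2.0, 3.0]
zero_baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(toy_grad, features, zero_baseline)

# Completeness: attributions sum to f(x) - f(baseline).
print(attributions, sum(attributions), toy_model(features) - toy_model(zero_baseline))
```

The completeness check at the end is a handy sanity test in practice: if the attributions do not add up to the difference between the prediction and the baseline prediction, the number of steps is usually too small.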