Assigning Stereochemistry to Single Diastereoisomers by GIAO NMR Calculation: The DP4 Probability
Steven G. Smith, Jonathan M. Goodman|Journal of the American Chemical Society|2010
GIAO NMR shift calculation has been applied to the challenging task of reliably assigning stereochemistry with quantifiable confidence when only one set of experimental data is available. We have compared several approaches for assigning a probability to each candidate structure and have tested the ability of these methods to distinguish up to 64 possible diastereoisomers of 117 different molecules, using NMR shifts obtained in rapid and computationally inexpensive single-point calculations on molecular mechanics geometries without time-consuming ab initio geometry optimization. We show that a probability analysis based on the errors in each ¹³C or ¹H shift is significantly more successful at making correct assignments with high confidence than are probabilities based on the correlation coefficient and mean absolute error parameters. Our new probability measure, which we have termed DP4, complements the probabilities obtained from our previously developed CP3 parameter, which applies to the case of assigning a pair of diastereoisomers when one has both experimental data sets. We illustrate the application of DP4 to assigning the stereochemistry or structure of 21 natural products that were originally misassigned in the literature or that required extensive synthesis of diastereoisomers to establish their stereochemistry.
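The idea in the abstract above, turning per-shift errors into a probability for each candidate structure, can be sketched roughly as follows. This is not the published DP4 procedure: the sketch assumes zero-mean Gaussian errors with a single made-up standard deviation (`sigma`), equal prior probabilities for all candidates, and hypothetical shift values and isomer names chosen only for illustration.

```python
import math


def two_sided_tail(error, sigma):
    """P(|X| > |error|) for X ~ Normal(0, sigma). Gaussian errors are an assumption."""
    tail = math.erfc(abs(error) / (sigma * math.sqrt(2.0)))
    return max(tail, 1e-300)  # floor avoids log(0) for very large errors


def candidate_probabilities(calculated, experimental, sigma):
    """Relative probability of each candidate, assuming equal priors.

    calculated: dict mapping candidate name -> list of calculated shifts (ppm)
    experimental: list of experimental shifts (ppm), in the same atom order
    """
    log_scores = {}
    for name, shifts in calculated.items():
        # Multiply the per-shift tail probabilities (summed in log space).
        log_scores[name] = sum(
            math.log(two_sided_tail(calc - expt, sigma))
            for calc, expt in zip(shifts, experimental)
        )
    best = max(log_scores.values())
    weights = {name: math.exp(s - best) for name, s in log_scores.items()}
    total = sum(weights.values())
    # Normalise so the probabilities over all candidates sum to 1.
    return {name: w / total for name, w in weights.items()}


# Hypothetical 13C shifts (ppm) for two candidate diastereoisomers.
experimental = [72.1, 35.4, 18.9]
calculated = {
    "isomer_A": [71.5, 36.0, 19.3],
    "isomer_B": [75.8, 31.2, 22.0],
}
probs = candidate_probabilities(calculated, experimental, sigma=2.0)
for name, p in sorted(probs.items()):
    print(f"{name}: {p:.3f}")
```

Working in log space keeps the product of many small per-shift probabilities from underflowing when a molecule has dozens of assigned shifts.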
The molecular basis for selective inhibition of unconventional mRNA splicing by an IRE1-binding small molecule
Benedict C. S. Cross, Peter J. Bond, Paweł Sadowski et al.|Proceedings of the National Academy of Sciences|2012
IRE1 couples endoplasmic reticulum unfolded protein load to RNA cleavage events that culminate in the sequence-specific splicing of the Xbp1 mRNA and in the regulated degradation of diverse membrane-bound mRNAs. We report on the identification of a small molecule inhibitor that attains its selectivity by forming an unusually stable Schiff base with lysine 907 in the IRE1 endonuclease domain, explained by solvent inaccessibility of the imine bond in the enzyme-inhibitor complex. The inhibitor (abbreviated 4μ8C) blocks substrate access to the active site of IRE1 and selectively inactivates both Xbp1 splicing and IRE1-mediated mRNA degradation. Surprisingly, inhibition of IRE1 endonuclease activity does not sensitize cells to the consequences of acute endoplasmic reticulum stress, but rather interferes with the expansion of secretory capacity. Thus, the chemical reactivity and sterics of a unique residue in the endonuclease active site of IRE1 can be exploited by selective inhibitors to interfere with protein secretion in pathological settings.
A review of molecular representation in the age of machine learning
Daniel Wigh, Jonathan M. Goodman, Alexei A. Lapkin|Wiley Interdisciplinary Reviews Computational Molecular Science|2022
Research in chemistry increasingly requires interdisciplinary work prompted by, among other things, advances in computing, machine learning, and artificial intelligence. Everyone working with molecules, whether chemist or not, needs an understanding of the representation of molecules in a machine‐readable format, as this is central to computational chemistry. Four classes of representations are introduced: string, connection table, feature‐based, and computer‐learned representations. Three of the most significant representations are simplified molecular‐input line‐entry system (SMILES), International Chemical Identifier (InChI), and the MDL molfile, of which SMILES was the first to be used successfully in conjunction with a variational autoencoder (VAE) to yield a continuous representation of molecules. This is noteworthy because a continuous representation allows for efficient navigation of the immensely large chemical space of possible molecules. Since 2018, when the first model of this type was published, considerable effort has been put into developing novel and improved methodologies. Most, if not all, researchers in the community make their work easily accessible on GitHub, though discussion of computation time and domain of applicability is often overlooked. Herein, we present questions for consideration in future work which we believe will make chemical VAEs even more accessible. This article is categorized under: Data Science > Chemoinformatics
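To make the first three representation classes named in the abstract above concrete, here is a small sketch using ethanol. The SMILES and InChI strings are the standard ones for ethanol; the connection table is a simplified stand-in (heavy atoms and bonds only, not the actual MDL molfile layout), and the function names `element_counts` and `adjacency` are hypothetical helpers introduced only for this example. Computer-learned representations are not shown.

```python
from collections import Counter

# String representations of ethanol.
smiles = "CCO"
inchi = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

# Simplified connection table: heavy atoms, then bonds as
# (atom index, atom index, bond order) with 1-based indices.
atoms = ["C", "C", "O"]
bonds = [(1, 2, 1), (2, 3, 1)]


def element_counts(atom_symbols):
    """A very simple feature-based representation: count of each element."""
    return dict(Counter(atom_symbols))


def adjacency(n_atoms, bond_list):
    """Neighbour lists derived from the connection table."""
    neighbours = {i: [] for i in range(1, n_atoms + 1)}
    for a, b, _order in bond_list:
        neighbours[a].append(b)
        neighbours[b].append(a)
    return neighbours


print(smiles)
print(inchi)
print(element_counts(atoms))
print(adjacency(len(atoms), bonds))
```

The string forms are compact and easy to store or feed to a sequence model, while the connection table makes the bonding explicit, which is why the two are often converted into one another by cheminformatics toolkits.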