 jt-vae


Appendix A: Analysis of variance of uncertainty estimators

Neural Information Processing Systems

We sample 1,000 points at random in latent space from an isotropic Gaussian with fixed standard deviation σ (we repeat the experiment for different values of the standard deviation). We analyze the impact of the number of samples used by each estimator on the variance of that estimator, measured over 10 independent runs. We use the sum of all pixel intensities in the image as a proxy for the thickness of the digits, which provides strong empirical results. We jointly train a variational autoencoder with an auxiliary network (the "Property network") that predicts digit thickness from the latent representation (see Figure 1). We evaluate the IS-MI estimator values in different regions of latent space.
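The setup above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the paper's code: `toy_estimator` is a hypothetical stand-in for an uncertainty estimator such as IS-MI (it simply averages noisy evaluations), and the latent dimension, seeds, and function names are all assumptions made for the example.

```python
import random
import statistics


def sample_latent_points(n_points, dim, sigma, rng):
    """Draw n_points from an isotropic Gaussian N(0, sigma^2 I)."""
    return [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_points)]


def thickness_proxy(image):
    """Sum of all pixel intensities in a 2D image given as a list of rows."""
    return sum(sum(row) for row in image)


def toy_estimator(z, n_samples, rng):
    """Hypothetical noisy estimator: mean of n_samples noisy evaluations at z.

    This is NOT the IS-MI estimator; it only mimics a Monte Carlo estimate
    whose variance shrinks as n_samples grows.
    """
    true_value = sum(x * x for x in z)
    draws = [true_value + rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return sum(draws) / n_samples


def variance_over_runs(z, n_samples, n_runs=10, seed=0):
    """Sample variance of the estimator at z across independent runs."""
    values = [
        toy_estimator(z, n_samples, random.Random(seed + run))
        for run in range(n_runs)
    ]
    return statistics.variance(values)


rng = random.Random(42)
points = sample_latent_points(n_points=1000, dim=8, sigma=1.0, rng=rng)
low_budget = variance_over_runs(points[0], n_samples=1)
high_budget = variance_over_runs(points[0], n_samples=1000)
```

Comparing `low_budget` with `high_budget` mirrors the question asked in the text: how the estimator's variance over 10 runs changes with the number of samples it uses.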


06fe1c234519f6812fc4c1baae25d6af-Paper.pdf

Neural Information Processing Systems

However, existing methods lack robustness as they may decide to explore areas of the latent space for which no data was available during training and where the decoder can be unreliable, leading to the generation of unrealistic or invalid objects. We propose to leverage the epistemic uncertainty of the decoder to guide the optimization process.


Leveraging Active Subspaces to Capture Epistemic Model Uncertainty in Deep Generative Models for Molecular Design

Abeer, A N M Nafiz, Jantre, Sanket, Urban, Nathan M, Yoon, Byung-Jun

arXiv.org Machine Learning

Deep generative models have been accelerating the inverse design process in material and drug design. Unlike their counterpart property predictors in typical molecular design frameworks, generative molecular design models have seen fewer efforts on uncertainty quantification (UQ) due to computational challenges in Bayesian inference posed by their large number of parameters. In this work, we focus on the junction-tree variational autoencoder (JT-VAE), a popular model for generative molecular design, and address this issue by leveraging the low dimensional active subspace to capture the uncertainty in the model parameters. Specifically, we approximate the posterior distribution over the active subspace parameters to estimate the epistemic model uncertainty in an extremely high dimensional parameter space. The proposed UQ scheme does not require alteration of the model architecture, making it readily applicable to any pre-trained model. Our experiments demonstrate the efficacy of the AS-based UQ and its potential impact on molecular optimization by exploring the model diversity under epistemic uncertainty.


Multi-Objective Latent Space Optimization of Generative Molecular Design Models

Abeer, A N M Nafiz, Urban, Nathan, Weil, M Ryan, Alexander, Francis J., Yoon, Byung-Jun

arXiv.org Artificial Intelligence

Molecular design based on generative models, such as variational autoencoders (VAEs), has become increasingly popular in recent years due to its efficiency for exploring high-dimensional molecular space to identify molecules with desired properties. While the efficacy of the initial model strongly depends on the training data, the sampling efficiency of the model for suggesting novel molecules with enhanced properties can be further enhanced via latent space optimization. In this paper, we propose a multi-objective latent space optimization (LSO) method that can significantly enhance the performance of generative molecular design (GMD). The proposed method adopts an iterative weighted retraining approach, where the respective weights of the molecules in the training data are determined by their Pareto efficiency. We demonstrate that our multi-objective GMD LSO method can significantly improve the performance of GMD for jointly optimizing multiple molecular properties.
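One way to turn Pareto efficiency into training weights is non-dominated sorting followed by a rank-to-weight mapping. The sketch below assumes all objectives are maximized; the `1/rank` weighting in `rank_weights` is a hypothetical choice for illustration, since the abstract does not state the exact formula the authors use.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(points):
    """Non-dominated sorting.

    Rank 1 is the Pareto front; rank 2 is the front of what remains after
    removing rank 1, and so on.
    """
    remaining = set(range(len(points)))
    ranks = [0] * len(points)
    rank = 1
    while remaining:
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


def rank_weights(ranks):
    """Hypothetical mapping: weight proportional to 1/rank, normalized to sum to 1."""
    raw = [1.0 / r for r in ranks]
    total = sum(raw)
    return [w / total for w in raw]


# Toy property pairs for five molecules (illustrative values only).
properties = [(1, 5), (5, 1), (3, 3), (2, 2), (1, 1)]
ranks = pareto_ranks(properties)
weights = rank_weights(ranks)
```

In an iterative retraining loop, such weights would be recomputed after each round so that molecules on or near the current Pareto front are sampled more often.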


Graph Machine Learning for Design of High-Octane Fuels

Rittig, Jan G., Ritzert, Martin, Schweidtmann, Artur M., Winkler, Stefanie, Weber, Jana M., Morsch, Philipp, Heufer, K. Alexander, Grohe, Martin, Mitsos, Alexander, Dahmen, Manuel

arXiv.org Artificial Intelligence

Fuels with high-knock resistance enable modern spark-ignition engines to achieve high efficiency and thus low CO2 emissions. Identification of molecules with desired autoignition properties indicated by a high research octane number and a high octane sensitivity is therefore of great practical relevance and can be supported by computer-aided molecular design (CAMD). Recent developments in the field of graph machine learning (graph-ML) provide novel, promising tools for CAMD. We propose a modular graph-ML CAMD framework that integrates generative graph-ML models with graph neural networks and optimization, enabling the design of molecules with desired ignition properties in a continuous molecular space. In particular, we explore the potential of Bayesian optimization and genetic algorithms in combination with generative graph-ML models. The graph-ML CAMD framework successfully identifies well-established high-octane components. It also suggests new candidates, one of which we experimentally investigate and use to illustrate the need for further auto-ignition training data.


Daily Digest

#artificialintelligence

Chemical probes are important tools for understanding biological systems. However, because of the huge combinatorial space of targets and potential compounds, traditional chemical screens cannot be applied systematically to find probes for all possible druggable targets. Here, researchers demonstrate a novel concept for overcoming this challenge by leveraging high-throughput metabolomics and overexpression to predict drug–target interactions. The metabolome profiles of yeast treated with 1,280 compounds from a chemical library were collected and compared with those of inducible yeast membrane protein overexpression strains. By matching metabolome profiles, they predicted which small molecules targeted which signaling systems and recovered known interactions.


Benchmarking Deep Graph Generative Models for Optimizing New Drug Molecules for COVID-19

Ward, Logan, Bilbrey, Jenna A., Choudhury, Sutanay, Kumar, Neeraj, Sivaraman, Ganesh

arXiv.org Artificial Intelligence

Design of new drug compounds with target properties is a key area of research in generative modeling. We present a small drug molecule design pipeline based on graph-generative models and a comparison study of two state-of-the-art graph generative models for designing COVID-19 targeted drug candidates: 1) a variational autoencoder-based approach (VAE) that uses prior knowledge of molecules that have been shown to be effective for earlier coronavirus treatments and 2) a deep Q-learning method (DQN) that generates optimized molecules without any proximity constraints. We evaluate the novelty of the automated molecule generation approaches by validating the candidate molecules with drug-protein binding affinity models. The VAE method produced two novel molecules with similar structures to the antiretroviral protease inhibitor Indinavir that show potential binding affinity for the SARS-CoV-2 protein target 3-chymotrypsin-like protease (3CL-protease).