jt-vae
Appendix A: Analysis of variance of uncertainty estimators
We sample 1,000 points at random in latent space from an isotropic Gaussian with fixed standard deviation σ (we repeat the experiment for different values of the standard deviation). We analyze the impact of the number of y_s samples used by each estimator on the variance of that estimator, measured over 10 independent runs. We use the sum of all pixel intensities in the image as a proxy for the thickness of the digits, which provides strong empirical results. We jointly train a variational autoencoder with an auxiliary network (the "Property network") predicting digit thickness based on latent representation (see Figure 1). We evaluate the IS-MI estimator values in different regions of latent space.
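The variance protocol described above can be illustrated with a minimal sketch. It is not the paper's code: `toy_uncertainty_estimate` is a hypothetical stand-in for an uncertainty estimator (a Monte Carlo mean whose noise grows with the distance from the origin), and the latent dimension, seeds, and all function names are assumptions made for illustration. Only the sampling scheme (isotropic Gaussian with standard deviation σ), the repeated runs, and the pixel-sum thickness proxy come from the text.

```python
import random
import statistics


def thickness_proxy(image):
    """Sum of all pixel intensities, used as a proxy for digit thickness."""
    return sum(sum(row) for row in image)


def sample_latents(n_points, dim, sigma, rng):
    """Draw latent points from an isotropic Gaussian N(0, sigma^2 I)."""
    return [[rng.gauss(0.0, sigma) for _ in range(dim)] for _ in range(n_points)]


def toy_uncertainty_estimate(z, n_samples, rng):
    """Hypothetical estimator (assumption): Monte Carlo mean of squared noise
    whose scale grows with the norm of z. It only mimics the fact that a
    sampling-based estimator is itself a random quantity."""
    scale = 1.0 + sum(v * v for v in z) ** 0.5
    draws = [rng.gauss(0.0, scale) ** 2 for _ in range(n_samples)]
    return sum(draws) / n_samples


def estimator_variance(latents, n_samples, n_runs=10, seed=0):
    """Variance of the estimate across independent runs, averaged over points."""
    rng = random.Random(seed)
    per_point = []
    for z in latents:
        estimates = [toy_uncertainty_estimate(z, n_samples, rng) for _ in range(n_runs)]
        per_point.append(statistics.variance(estimates))
    return statistics.mean(per_point)
```

Calling `estimator_variance(sample_latents(1000, 2, sigma, random.Random(0)), n_samples)` for several values of `n_samples` and `sigma` reproduces the shape of the experiment: the run-to-run variance should shrink as the number of samples per estimate grows.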
However, existing methods lack robustness as they may decide to explore areas of the latent space for which no data was available during training and where the decoder can be unreliable, leading to the generation of unrealistic or invalid objects. We propose to leverage the epistemic uncertainty of the decoder to guide the optimization process.
Leveraging Active Subspaces to Capture Epistemic Model Uncertainty in Deep Generative Models for Molecular Design
Abeer, A N M Nafiz, Jantre, Sanket, Urban, Nathan M, Yoon, Byung-Jun
Deep generative models have been accelerating the inverse design process in material and drug design. Unlike their counterpart property predictors in typical molecular design frameworks, generative molecular design models have seen fewer efforts on uncertainty quantification (UQ) due to computational challenges in Bayesian inference posed by their large number of parameters. In this work, we focus on the junction-tree variational autoencoder (JT-VAE), a popular model for generative molecular design, and address this issue by leveraging the low dimensional active subspace to capture the uncertainty in the model parameters. Specifically, we approximate the posterior distribution over the active subspace parameters to estimate the epistemic model uncertainty in an extremely high dimensional parameter space. The proposed UQ scheme does not require alteration of the model architecture, making it readily applicable to any pre-trained model. Our experiments demonstrate the efficacy of the AS-based UQ and its potential impact on molecular optimization by exploring the model diversity under epistemic uncertainty.
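As a rough illustration of what an active subspace is, the sketch below estimates the leading active direction of a toy function from the covariance of its gradients. This is the generic construction only, under loud assumptions: the toy loss `f(w) = (a . w)^2`, the use of power iteration, the restriction to a single direction, and every function name are illustrative choices, not the authors' procedure. The posterior approximation over active subspace parameters described in the abstract is not shown.

```python
import random


def grad_toy_loss(w, a):
    """Gradient of the hypothetical loss f(w) = (a . w)^2 (an assumption)."""
    s = sum(ai * wi for ai, wi in zip(a, w))
    return [2.0 * s * ai for ai in a]


def gradient_covariance(grads):
    """Empirical estimate of C = E[grad f(w) grad f(w)^T]."""
    n, d = len(grads), len(grads[0])
    return [[sum(g[i] * g[j] for g in grads) / n for j in range(d)] for i in range(d)]


def leading_eigenvector(mat, n_iter=200, seed=0):
    """Power iteration: returns a unit vector along the dominant eigenvector."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in mat]
    for _ in range(n_iter):
        v = [sum(row[j] * v[j] for j in range(len(v))) for row in mat]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return v


def active_direction(a, n_samples=500, seed=0):
    """Sample parameters, collect gradients, return the leading active direction."""
    rng = random.Random(seed)
    grads = [grad_toy_loss([rng.gauss(0.0, 1.0) for _ in a], a) for _ in range(n_samples)]
    return leading_eigenvector(gradient_covariance(grads), seed=seed)
```

For this toy loss every gradient is a multiple of `a`, so the recovered direction is parallel to `a` up to sign. In a real model the gradients would come from the network loss, and several leading eigenvectors would typically be kept.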
Multi-Objective Latent Space Optimization of Generative Molecular Design Models
Abeer, A N M Nafiz, Urban, Nathan, Weil, M Ryan, Alexander, Francis J., Yoon, Byung-Jun
Molecular design based on generative models, such as variational autoencoders (VAEs), has become increasingly popular in recent years due to its efficiency for exploring high-dimensional molecular space to identify molecules with desired properties. While the efficacy of the initial model strongly depends on the training data, the sampling efficiency of the model for suggesting novel molecules with enhanced properties can be further enhanced via latent space optimization. In this paper, we propose a multi-objective latent space optimization (LSO) method that can significantly enhance the performance of generative molecular design (GMD). The proposed method adopts an iterative weighted retraining approach, where the respective weights of the molecules in the training data are determined by their Pareto efficiency. We demonstrate that our multi-objective GMD LSO method can significantly improve the performance of GMD for jointly optimizing multiple molecular properties.
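One way to turn Pareto efficiency into training weights is sketched below. The abstract does not specify the weighting formula, so this is an assumption: molecules are ranked by non-dominated sorting (rank 0 is the Pareto front), and a rank-based weight `1 / (k * N + rank)` is applied and normalized. The parameter `k`, the maximization convention, and all function names are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(points):
    """Non-dominated sorting: rank 0 is the Pareto front, rank 1 is the
    front that remains after removing rank 0, and so on."""
    remaining = set(range(len(points)))
    ranks = [0] * len(points)
    rank = 0
    while remaining:
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = rank
        remaining -= front
        rank += 1
    return ranks


def retraining_weights(points, k=1e-3):
    """Normalized weights that decrease with Pareto rank (assumed formula)."""
    n = len(points)
    raw = [1.0 / (k * n + r) for r in pareto_ranks(points)]
    total = sum(raw)
    return [w / total for w in raw]
```

For example, with property pairs `[(1, 5), (3, 3), (5, 1), (2, 2), (1, 1)]` the first three points are mutually non-dominated and share the largest weight, while `(2, 2)` and `(1, 1)` fall on later fronts and receive progressively smaller weights. In an iterative scheme, these weights would be recomputed after each round of sampling and retraining.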
Graph Machine Learning for Design of High-Octane Fuels
Rittig, Jan G., Ritzert, Martin, Schweidtmann, Artur M., Winkler, Stefanie, Weber, Jana M., Morsch, Philipp, Heufer, K. Alexander, Grohe, Martin, Mitsos, Alexander, Dahmen, Manuel
Fuels with high-knock resistance enable modern spark-ignition engines to achieve high efficiency and thus low CO2 emissions. Identification of molecules with desired autoignition properties indicated by a high research octane number and a high octane sensitivity is therefore of great practical relevance and can be supported by computer-aided molecular design (CAMD). Recent developments in the field of graph machine learning (graph-ML) provide novel, promising tools for CAMD. We propose a modular graph-ML CAMD framework that integrates generative graph-ML models with graph neural networks and optimization, enabling the design of molecules with desired ignition properties in a continuous molecular space. In particular, we explore the potential of Bayesian optimization and genetic algorithms in combination with generative graph-ML models. The graph-ML CAMD framework successfully identifies well-established high-octane components. It also suggests new candidates, one of which we experimentally investigate and use to illustrate the need for further auto-ignition training data.
Daily Digest
Chemical probes are important tools for understanding biological systems. However, because of the huge combinatorial space of targets and potential compounds, traditional chemical screens cannot be applied systematically to find probes for all possible druggable targets. Here, researchers demonstrate a novel concept for overcoming this challenge by leveraging high-throughput metabolomics and overexpression to predict drug–target interactions. The metabolome profiles of yeast treated with 1,280 compounds from a chemical library were collected and compared with those of inducible yeast membrane protein overexpression strains. By matching metabolome profiles, they predicted which small molecules targeted which signaling systems and recovered known interactions.
Benchmarking Deep Graph Generative Models for Optimizing New Drug Molecules for COVID-19
Ward, Logan, Bilbrey, Jenna A., Choudhury, Sutanay, Kumar, Neeraj, Sivaraman, Ganesh
Design of new drug compounds with target properties is a key area of research in generative modeling. We present a small drug molecule design pipeline based on graph-generative models and a comparison study of two state-of-the-art graph generative models for designing COVID-19 targeted drug candidates: 1) a variational autoencoder-based approach (VAE) that uses prior knowledge of molecules that have been shown to be effective for earlier coronavirus treatments and 2) a deep Q-learning method (DQN) that generates optimized molecules without any proximity constraints. We evaluate the novelty of the automated molecule generation approaches by validating the candidate molecules with drug-protein binding affinity models. The VAE method produced two novel molecules with similar structures to the antiretroviral protease inhibitor Indinavir that show potential binding affinity for the SARS-CoV-2 protein target 3-chymotrypsin-like protease (3CL-protease).