
 qm9






We thank the reviewers for the thoughtful feedback in these difficult times caused by the global COVID-19 pandemic

Neural Information Processing Systems

We thank the reviewers for the thoughtful feedback in these difficult times caused by the global COVID-19 pandemic. QM9 is used for training, the model must be based on LCAO, and QDF achieved high extrapolation performance. We emphasize that even this LDA-like HK map achieved high extrapolation performance. We will address this in future work. Of course, QDF can be proposed without a comparison to GCN.


f9b9f0fef2274a6b7009b5d52f44a3b6-AuthorFeedback.pdf

Neural Information Processing Systems

The fundamental difference is between "many to one" Figure 1 shows example generations from the model trained on the ChEMBL. We actually did run an RL baseline (Eq. We discuss the work of Norouzi et al. [2016] in detail in Section 3.3. They also do not use the entropy term in training, only to motivate derivations.


Pre-training via Denoising for Molecular Property Prediction

arXiv.org Artificial Intelligence

Work done during an internship at DeepMind. Many important problems involving molecular property prediction from 3D structures have limited data, posing a generalization challenge for neural networks. In this paper, we describe a pre-training technique based on denoising that achieves a new state-of-the-art in molecular property prediction by utilizing large datasets of 3D molecular structures at equilibrium to learn meaningful representations for downstream tasks. Relying on the well-known link between denoising autoencoders and score-matching, we show that the denoising objective corresponds to learning a molecular force field - arising from approximating the Boltzmann distribution with a mixture of Gaussians - directly from equilibrium structures. Our experiments demonstrate that using this pre-training objective significantly improves performance on multiple benchmarks, achieving a new state-of-the-art on the majority of targets in the widely used QM9 dataset. Our analysis then provides practical insights into the effects of different factors - dataset sizes, model size and architecture, and the choice of upstream and downstream datasets - on pre-training. The success of the best performing neural networks in vision and natural language processing (NLP) relies on pre-training the models on large datasets to learn meaningful features for downstream tasks (Dai & Le, 2015; Simonyan & Zisserman, 2014; Devlin et al., 2018; Brown et al., 2020; Dosovitskiy et al., 2020). For example, none of the best models on the widely used QM9 benchmark use any form of pre-training (e.g. Effective methods for pre-training could have a significant impact on fields such as drug discovery and material science. In this work, we focus on the problem of how large datasets of 3D molecular structures can be utilized to improve performance on downstream molecular property prediction tasks that also rely on 3D structures as input. 
Our answer is a form of self-supervised pre-training that generates useful representations for downstream prediction tasks, leading to state-of-the-art (SOTA) results. Inspired by recent advances in noise regularization for graph neural networks (GNNs) (Godwin et al., 2022), our pre-training objective is based on denoising in the space of structures (and is hence self-supervised). Unlike existing pre-training methods, which largely focus on 2D graphs, our approach targets the setting where the downstream task involves 3D point clouds defining the molecular structure.
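
The denoising objective summarized above can be illustrated with a minimal sketch. It assumes i.i.d. Gaussian noise with standard deviation sigma added to every atomic coordinate and a mean-squared-error loss on the predicted noise. The names perturb, denoising_loss, oracle_model and zero_model are hypothetical and are not taken from the paper; in practice the predictor would be a GNN rather than the stand-ins used here. Under these assumptions the loss-minimizing prediction equals the score of the noise-smoothed distribution scaled by -sigma^2, which is the connection to a force field that the abstract mentions.

```python
import random


def perturb(coords, sigma, rng):
    """Add i.i.d. Gaussian noise to each coordinate.

    Returns the noisy structure and the noise that was added (the training target).
    """
    noise = [[rng.gauss(0.0, sigma) for _ in atom] for atom in coords]
    noisy = [[c + e for c, e in zip(atom, eps)] for atom, eps in zip(coords, noise)]
    return noisy, noise


def denoising_loss(predicted_noise, noise):
    """Mean squared error between predicted and true noise over all coordinates."""
    total, count = 0.0, 0
    for pred_atom, true_atom in zip(predicted_noise, noise):
        for p, n in zip(pred_atom, true_atom):
            total += (p - n) ** 2
            count += 1
    return total / count


def oracle_model(noisy, reference):
    """Hypothetical stand-in for a trained network.

    It cheats by using the clean structure, so it recovers the noise exactly.
    """
    return [[x - r for x, r in zip(a, b)] for a, b in zip(noisy, reference)]


def zero_model(noisy):
    """Hypothetical untrained baseline that always predicts zero noise."""
    return [[0.0 for _ in atom] for atom in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "equilibrium structure": three atoms with made-up coordinates.
    coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    noisy, noise = perturb(coords, sigma=0.1, rng=rng)
    print("oracle loss:", denoising_loss(oracle_model(noisy, coords), noise))
    print("zero-prediction loss:", denoising_loss(zero_model(noisy), noise))
```

The oracle is only there to show that the loss vanishes when the noise is recovered; a learned model has access to the noisy structure alone.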


TorchMD-NET: Equivariant Transformers for Neural Network based Molecular Potentials

arXiv.org Artificial Intelligence

The prediction of quantum mechanical properties is historically plagued by a trade-off between accuracy and speed. Machine learning potentials have previously shown great success in this domain, reaching increasingly better accuracy while maintaining computational efficiency comparable with classical force fields. In this work we propose TorchMD-NET, a novel equivariant transformer (ET) architecture, outperforming state-of-the-art on MD17, ANI-1, and many QM9 targets in both accuracy and computational efficiency. Through an extensive attention weight analysis, we gain valuable insights into the black box predictor and show differences in the learned representation of conformers versus conformations sampled from molecular dynamics or normal modes. Furthermore, we highlight the importance of datasets including off-equilibrium conformations for the evaluation of molecular potentials.


3D Infomax improves GNNs for Molecular Property Prediction

arXiv.org Artificial Intelligence

Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. Including 3D molecular structure as input to learned models improves their performance for many molecular tasks. However, this information is infeasible to compute at the scale required by several real-world applications. We propose pre-training a model to reason about the geometry of molecules given only their 2D molecular graphs. Using methods from self-supervised learning, we maximize the mutual information between 3D summary vectors and the representations of a Graph Neural Network (GNN) such that they contain latent 3D information. During fine-tuning on molecules with unknown geometry, the GNN still produces implicit 3D information and can use it to improve downstream tasks. We show that 3D pre-training provides significant improvements for a wide range of properties, such as a 22% average MAE reduction on eight quantum mechanical properties. Moreover, the learned representations can be effectively transferred between datasets in different molecular spaces. The understanding of molecular and quantum chemistry is a rapidly growing area for deep learning with models having direct real-world impacts in quantum chemistry (Dral, 2020), protein structure prediction (Jumper et al., 2021), materials science (Schmidt et al., 2019), and drug discovery (Stokes et al., 2020). In particular, for the task of molecular property prediction, GNNs have had great success (Yang et al., 2019). GNNs operate on the molecular graph by updating each atom's representation based on the atoms connected to it via covalent bonds. However, these models reason poorly about other important interatomic forces that depend on the atoms' relative positions in space. Previous works showed that using the atoms' 3D coordinates in space improves the accuracy of molecular property prediction (Schütt et al., 2017; Klicpera et al., 2020b; Liu et al., 2021; Klicpera et al., 2021). 
However, using classical molecular dynamics simulations to explicitly compute a molecule's geometry before predicting its properties is computationally intractable for many real-world applications. Even recent Machine Learning (ML) methods for conformation generation (Xu et al., 2021b; Shi et al., 2021; Ganea et al., 2021) are still too slow for large-scale applications. A GNN is pre-trained by maximizing the mutual information (MI) between its embedding of a 2D molecular graph and a representation capturing the 3D information that is produced by a separate network.
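
Maximizing mutual information between a 2D-graph embedding and a 3D representation is commonly done with a contrastive lower bound. The sketch below uses an InfoNCE-style loss with cosine similarity as one such estimator; it is an illustration under that assumption and not necessarily the exact loss used by the authors. The function names and the temperature value are hypothetical, and the embeddings are plain lists standing in for network outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(z2d, z3d, temperature=0.1):
    """InfoNCE-style contrastive loss over a batch.

    z2d[i] and z3d[i] are embeddings of the same molecule (positive pair);
    z3d[j] with j != i serve as negatives for z2d[i].
    Lower loss corresponds to a higher mutual-information lower bound.
    """
    n = len(z2d)
    loss = 0.0
    for i in range(n):
        logits = [cosine(z2d[i], z3d[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        loss += log_denominator - logits[i]
    return loss / n


if __name__ == "__main__":
    graph_emb = [[1.0, 0.0], [0.0, 1.0]]
    matched_3d = [[1.0, 0.0], [0.0, 1.0]]     # each 3D vector agrees with its graph
    mismatched_3d = [[0.0, 1.0], [1.0, 0.0]]  # pairs swapped
    print("matched:", info_nce(graph_emb, matched_3d))
    print("mismatched:", info_nce(graph_emb, mismatched_3d))
```

The matched batch yields a much smaller loss than the mismatched one, which is the signal that pushes the 2D encoder to carry 3D information.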


Calibrated Uncertainty for Molecular Property Prediction using Ensembles of Message Passing Neural Networks

arXiv.org Machine Learning

Data-driven methods based on machine learning have the potential to accelerate analysis of atomic structures. However, machine learning models can produce overconfident predictions and it is therefore crucial to detect and handle uncertainty carefully. Here, we extend a message passing neural network designed specifically for predicting properties of molecules and materials with a calibrated probabilistic predictive distribution. The method presented in this paper differs from the previous work by considering both aleatoric and epistemic uncertainty in a unified framework, and by re-calibrating the predictive distribution on unseen data. Through computer experiments, we show that our approach results in accurate models for predicting molecular formation energies with calibrated uncertainty in and out of the training data distribution on two public molecular benchmark datasets, QM9 and PC9. The proposed method provides a general framework for training and evaluating neural network ensemble models that are able to produce accurate predictions of properties of molecules with calibrated uncertainty.
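
A common way to combine aleatoric and epistemic uncertainty in an ensemble is to treat the members as an equally weighted Gaussian mixture, and a simple way to recalibrate is to rescale the predicted variance on held-out data. The sketch below shows both steps under those assumptions; the scalar rescaling and the function names are hypothetical illustrations and may differ from the procedure in the paper.

```python
def ensemble_predict(means, variances):
    """Combine per-member Gaussian predictions for one molecule.

    aleatoric: average of the variances predicted by the members.
    epistemic: spread (population variance) of the member means.
    The total predictive variance is their sum.
    """
    m = len(means)
    mean = sum(means) / m
    aleatoric = sum(variances) / m
    epistemic = sum((mu - mean) ** 2 for mu in means) / m
    return mean, aleatoric, epistemic


def fit_variance_scale(errors, variances):
    """Scalar s that minimizes the Gaussian negative log-likelihood
    when every predicted variance v is replaced by s * v.

    Setting the derivative to zero gives s = mean(error^2 / v).
    """
    return sum(e * e / v for e, v in zip(errors, variances)) / len(errors)


if __name__ == "__main__":
    # Two hypothetical ensemble members predicting one formation energy.
    mean, aleatoric, epistemic = ensemble_predict([1.0, 3.0], [0.5, 0.5])
    print(mean, aleatoric, epistemic)  # 2.0 0.5 1.0

    # Held-out errors twice as large as the predicted standard deviation
    # suggest the variance should be inflated by a factor of 4.
    print(fit_variance_scale([2.0, 2.0], [1.0, 1.0]))  # 4.0
```

A scale above 1 indicates overconfident predictions on the held-out set, and a scale below 1 indicates underconfident ones.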