3D Infomax improves GNNs for Molecular Property Prediction

Stärk, Hannes, Beaini, Dominique, Corso, Gabriele, Tossou, Prudencio, Dallago, Christian, Günnemann, Stephan, Liò, Pietro

arXiv.org Artificial Intelligence 

Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. Including 3D molecular structure as input to learned models improves their performance for many molecular tasks. However, this information is infeasible to compute at the scale required by several real-world applications. We propose pre-training a model to reason about the geometry of molecules given only their 2D molecular graphs. Using methods from self-supervised learning, we maximize the mutual information between 3D summary vectors and the representations of a Graph Neural Network (GNN) such that they contain latent 3D information. During fine-tuning on molecules with unknown geometry, the GNN still produces implicit 3D information and can use it to improve downstream tasks. We show that 3D pre-training provides significant improvements for a wide range of properties, such as a 22% average MAE reduction on eight quantum mechanical properties. Moreover, the learned representations can be effectively transferred between datasets in different molecular spaces.

The understanding of molecular and quantum chemistry is a rapidly growing area for deep learning with models having direct real-world impacts in quantum chemistry (Dral, 2020), protein structure prediction (Jumper et al., 2021), materials science (Schmidt et al., 2019), and drug discovery (Stokes et al., 2020). In particular, for the task of molecular property prediction, GNNs have had great success (Yang et al., 2019). GNNs operate on the molecular graph by updating each atom's representation based on the atoms connected to it via covalent bonds. However, these models reason poorly about other important interatomic forces that depend on the atoms' relative positions in space. Previous works showed that using the atoms' 3D coordinates in space improves the accuracy of molecular property prediction (Schütt et al., 2017; Klicpera et al., 2020b; Liu et al., 2021; Klicpera et al., 2021).
However, using classical molecular dynamics simulations to explicitly compute a molecule's geometry before predicting its properties is computationally intractable for many real-world applications. Even recent Machine Learning (ML) methods for conformation generation (Xu et al., 2021b; Shi et al., 2021; Ganea et al., 2021) are still too slow for large-scale applications. We therefore pre-train a GNN by maximizing the mutual information (MI) between its embedding of a 2D molecular graph and a representation of the molecule's 3D information that is produced by a separate network.
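
To make the pre-training objective concrete, the following is a minimal sketch of one way to maximize MI between paired embeddings. It assumes an InfoNCE-style contrastive loss, which is a common lower-bound estimator of mutual information; the text above does not specify which estimator is used. The function name `info_nce_loss`, the `temperature` value, and the random vectors standing in for the 2D GNN and 3D network outputs are all illustrative assumptions.

```python
import math
import random


def _normalize(v):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(z2d, z3d, temperature=0.1):
    """InfoNCE-style contrastive loss (an assumed MI estimator, see lead-in).

    z2d[i] and z3d[i] are the 2D-graph and 3D embeddings of molecule i.
    Each 2D embedding should score its own 3D embedding (the positive)
    higher than the 3D embeddings of the other molecules in the batch.
    """
    z2d = [_normalize(v) for v in z2d]
    z3d = [_normalize(v) for v in z3d]
    total = 0.0
    for i, a in enumerate(z2d):
        # Similarity of molecule i's 2D embedding to every 3D embedding.
        logits = [sum(x * y for x, y in zip(a, b)) / temperature for b in z3d]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability assigned to the true pair.
        total += log_denom - logits[i]
    return total / len(z2d)


# Stand-in embeddings (hypothetical; real ones would come from the two networks).
random.seed(0)
z3d = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(8)]
z2d_aligned = [[x + 0.01 * random.gauss(0.0, 1.0) for x in row] for row in z3d]
z2d_shuffled = z3d[::-1]  # every 2D embedding is paired with the wrong molecule

loss_aligned = info_nce_loss(z2d_aligned, z3d)
loss_shuffled = info_nce_loss(z2d_shuffled, z3d)
print(loss_aligned < loss_shuffled)
```

In this toy setting the loss is lower when each 2D embedding agrees with the 3D embedding of the same molecule than when the pairing is scrambled. Driving such a loss down during pre-training is what would push a 2D GNN's representations to carry latent 3D information.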