Riley, Patrick
Scaling Symbolic Methods using Gradients for Neural Model Explanation
Sahoo, Subham Sekhar, Venugopalan, Subhashini, Li, Li, Singh, Rishabh, Riley, Patrick
Symbolic techniques based on Satisfiability Modulo Theory (SMT) solvers have been proposed for analyzing and verifying neural network properties, but their use has been limited by poor scalability to larger networks. In this work, we propose a technique for combining gradient-based methods with symbolic techniques to scale such analyses and demonstrate its application for model explanation. In particular, we apply this technique to identify minimal regions in an input that are most relevant for a neural network's prediction. Our approach uses gradient information (based on Integrated Gradients [23]) to focus on a subset of neurons in the first layer, which allows our technique to scale to large networks. The corresponding SMT constraints encode the minimal input mask discovery problem such that after masking the input, the activations of the selected neurons are still above a threshold. After solving for the minimal masks, our approach scores the mask regions to generate a relative ordering of the features within the mask. This produces a saliency map which explains "where a model is looking" when making a prediction. We evaluate our technique on three datasets (MNIST, ImageNet, and Beer Reviews) and demonstrate both quantitatively and qualitatively that the regions generated by our approach are sparser and achieve higher saliency scores than those from gradient-based methods alone.
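As a rough illustration of the gradient signal mentioned in this abstract, the sketch below approximates Integrated Gradients for a toy model. It is not the authors' pipeline: the sigmoid model, the weights `w`, the zero baseline, and the step count are all assumptions chosen for illustration, and the attributions here are computed for inputs rather than for first-layer neurons.

```python
import math

def model(x, w):
    """Toy differentiable model (hypothetical stand-in for a network): sigmoid of a weighted sum."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def model_grad(x, w):
    """Analytic gradient of the toy model with respect to each input."""
    y = model(x, w)
    return [y * (1.0 - y) * wi for wi in w]

def integrated_gradients(x, baseline, w, steps=200):
    """Midpoint Riemann-sum approximation of the Integrated Gradients path integral."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_grad(point, w)
        total = [t + g for t, g in zip(total, grad)]
    # Scale the averaged gradient by the input difference.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]

w = [2.0, -1.0, 0.5]          # assumed weights
x = [1.0, 0.5, -2.0]          # assumed input
baseline = [0.0, 0.0, 0.0]    # assumed all-zero baseline

attributions = integrated_gradients(x, baseline, w)
# Completeness check: attributions should sum to model(x) - model(baseline).
completeness_gap = abs(sum(attributions) - (model(x, w) - model(baseline, w)))
```

In this toy setting the attributions sum (up to discretization error) to the difference between the model output at the input and at the baseline, which is the completeness property of Integrated Gradients.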
Decoding Molecular Graph Embeddings with Reinforcement Learning
Kearnes, Steven, Li, Li, Riley, Patrick
We present RL-VAE, a graph-to-graph variational autoencoder that uses reinforcement learning to decode molecular graphs from latent embeddings. Methods have been described previously for graph-to-graph autoencoding, but these approaches require sophisticated decoders that increase the complexity of training and evaluation (such as requiring parallel encoders and decoders or non-trivial graph matching). Here, we repurpose a simple graph generator to enable efficient decoding and generation of molecular graphs.
Neural-Guided Symbolic Regression with Semantic Prior
Li, Li, Fan, Minjie, Singh, Rishabh, Riley, Patrick
Symbolic regression has been shown to be quite useful in many domains from discovering scientific laws to industrial empirical modeling. Existing methods focus on numerically fitting the given data. However, in many domains, symbolically derivable properties of the desired expressions are known. We illustrate these "semantic priors" with leading powers (the polynomial behavior as the input approaches 0 and $\infty$). We introduce an expression generating neural network that significantly favors the generation of expressions with desired leading powers, even generalizing to powers not in the training set. We then describe our Neural-Guided Monte Carlo Tree Search (NG-MCTS) algorithm for symbolic regression. We extensively evaluate our method on thousands of symbolic regression tasks and desired expressions to show that it significantly outperforms baseline algorithms and exhibits discovery of novel expressions outside of the training set.
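The notion of a leading power can be made concrete with a small numerical sketch. The helper `leading_power` and the example expression are hypothetical and not taken from the paper; the sketch simply estimates the exponent p in f(x) ~ c x^p from a log-log slope, once near 0 and once at large x.

```python
import math

def leading_power(f, x0, ratio=10.0):
    """Estimate p such that f(x) ~ c * x**p around scale x0, using a log-log slope."""
    x1, x2 = x0, x0 * ratio
    return math.log(abs(f(x2)) / abs(f(x1))) / math.log(x2 / x1)

def f(x):
    # Hypothetical example expression: behaves like x**2 near 0 and like x at infinity.
    return x**2 / (x + 1.0)

power_at_zero = leading_power(f, 1e-8)      # slope measured very close to 0
power_at_infinity = leading_power(f, 1e8)   # slope measured at very large x
```

For this example the two estimates come out close to 2 and 1, which is the kind of symbolic constraint a "semantic prior" would impose on candidate expressions.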
Optimization of Molecules via Deep Reinforcement Learning
Zhou, Zhenpeng, Kearnes, Steven, Li, Li, Zare, Richard N., Riley, Patrick
We present a framework, which we call Molecule Deep $Q$-Networks (MolDQN), for molecule optimization by combining domain knowledge of chemistry and state-of-the-art reinforcement learning techniques (prioritized experience replay, double $Q$-learning, and randomized value functions). We directly define modifications on molecules, thereby ensuring 100% chemical validity. Further, we operate without pre-training on any dataset to avoid possible bias from the choice of that set. As a result, our model outperforms several other state-of-the-art algorithms by having a higher success rate of acquiring molecules with better properties. Inspired by problems faced during medicinal chemistry lead optimization, we extend our model with multi-objective reinforcement learning, which maximizes drug-likeness while maintaining similarity to the original molecule. To understand how the model works, we further show the path it takes through chemical space while optimizing a molecule.
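Of the reinforcement learning techniques listed in this abstract, double $Q$-learning is easy to sketch in isolation. The snippet below shows the generic double $Q$-learning target only, not the MolDQN implementation; the function name, the lists of $Q$-values, and the discount factor are assumptions made for illustration. In MolDQN each action would correspond to a modification of the molecule, whereas here actions are plain list indices.

```python
def double_q_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double Q-learning target: the online network selects the action,
    the target network supplies the value of that action."""
    if done or not next_q_online:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]

# Hypothetical Q-values for three candidate actions in the next state.
next_q_online = [0.2, 1.5, 0.7]   # online network prefers action 1
next_q_target = [0.9, 0.4, 1.1]   # target network would prefer action 2

target = double_q_target(1.0, next_q_online, next_q_target, gamma=0.9)
terminal_target = double_q_target(1.0, next_q_online, next_q_target, done=True)
```

Decoupling action selection from action evaluation in this way is what reduces the overestimation that a single max over one network's values tends to produce.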
Tensor Field Networks: Rotation- and Translation-Equivariant Neural Networks for 3D Point Clouds
Thomas, Nathaniel, Smidt, Tess, Kearnes, Steven, Yang, Lusann, Li, Li, Kohlhoff, Kai, Riley, Patrick
We introduce tensor field networks, which are locally equivariant to 3D rotations and translations (and invariant to permutations of points) at every layer. 3D rotation equivariance removes the need for data augmentation to identify features in arbitrary orientations. Our network uses filters built from spherical harmonics; due to the mathematical consequences of this filter choice, each layer accepts as input (and guarantees as output) scalars, vectors, and higher-order tensors, in the geometric sense of these terms. We demonstrate how tensor field networks learn to model simple physics (Newtonian gravitation and moment of inertia), classify simple 3D shapes (trained on one orientation and tested on shapes in arbitrary orientations), and, given a small organic molecule with an atom removed, replace the correct element at the correct location in space.
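Rotation equivariance means that rotating the input and then applying the function gives the same result as applying the function and then rotating the output. The sketch below checks this numerically for Newtonian gravitational acceleration, one of the physics tasks named in the abstract. It does not implement a tensor field network; the point set, masses, rotation angle, and helper names are assumptions for illustration.

```python
import math

def rotate_z(p, theta):
    """Rotate a 3D point or vector about the z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    x, y, z = p
    return (c * x - s * y, s * x + c * y, z)

def gravity_accelerations(points, masses):
    """Newtonian acceleration on each point mass (with G = 1), one 3-vector per point."""
    result = []
    for i, pi in enumerate(points):
        acc = [0.0, 0.0, 0.0]
        for j, pj in enumerate(points):
            if i == j:
                continue
            d = [b - a for a, b in zip(pi, pj)]       # vector from point i to point j
            r = math.sqrt(sum(v * v for v in d))
            for k in range(3):
                acc[k] += masses[j] * d[k] / r**3
        result.append(tuple(acc))
    return result

points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]  # assumed positions
masses = [1.0, 2.0, 3.0]                                       # assumed masses
theta = 0.7                                                    # assumed rotation angle

rotate_then_compute = gravity_accelerations([rotate_z(p, theta) for p in points], masses)
compute_then_rotate = [rotate_z(a, theta) for a in gravity_accelerations(points, masses)]
max_diff = max(
    abs(u - v)
    for a, b in zip(rotate_then_compute, compute_then_rotate)
    for u, v in zip(a, b)
)
```

The two results agree up to floating-point error, so the output vectors rotate together with the input points. A network that is equivariant at every layer has this property by construction, which is why it needs no rotational data augmentation.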
Molecular Graph Convolutions: Moving Beyond Fingerprints
Kearnes, Steven, McCloskey, Kevin, Berndl, Marc, Pande, Vijay, Riley, Patrick
Molecular "fingerprints" encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular "graph convolutions", a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph (atoms, bonds, distances, etc.), which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.
Massively Multitask Networks for Drug Discovery
Ramsundar, Bharath, Kearnes, Steven, Riley, Patrick, Webster, Dale, Konerding, David, Pande, Vijay
Massively multitask neural architectures provide a learning framework for drug discovery that synthesizes information from many distinct biological sources. To train these architectures at scale, we gather large amounts of data from public sources to create a dataset of nearly 40 million measurements across more than 200 biological targets. We investigate several aspects of the multitask framework by performing a series of empirical studies and obtain some interesting results: (1) massively multitask networks obtain predictive accuracies significantly better than single-task methods, (2) the predictive power of multitask networks improves as additional tasks and data are added, (3) the total amount of data and the total number of tasks both contribute significantly to multitask improvement, and (4) multitask networks afford limited transferability to tasks not in the training set. Our results underscore the need for greater data sharing and further algorithmic innovation to accelerate the drug discovery process.
SPADES: A System for Parallel-Agent, Discrete-Event Simulation
Riley, Patrick
Simulations are an excellent tool for studying AI. However, the simulation technology in use by, and designed for, the AI community often fails to take advantage of much of the work in the larger simulation community to produce stable, repeatable, and efficient simulations. I present SPADES (System for Parallel-Agent, Discrete-Event Simulation) as a simulation substrate for the AI community. SPADES focuses on the agent as a fundamental simulation component. The "thinking time" of an agent is tracked and reflected in the results of the agent's actions. SPADES supports and manages the distribution of agents across machines while remaining robust to variations in network performance and machine load. SPADES is not tied to any particular simulation and is a powerful new tool for creating simulations for the study of AI.
The CMUnited-99 Champion Simulator Team
Stone, Peter, Riley, Patrick, Veloso, Manuela M.
The CMUnited-99 simulator team became the 1999 RoboCup simulator league champion by winning all 8 of its games, outscoring opponents by a combined score of 110-0. CMUnited-99 builds on the successful CMUnited-98 implementation but also improves on it in many ways. This article gives an overview of CMUnited-99's improvements over CMUnited-98.
CMUnited-98 Simulator Team
Stone, Peter, Veloso, Manuela M., Riley, Patrick
The CMUnited-98 simulator team became the 1998 RoboCup simulator league champion by winning all 8 of its games, outscoring opponents by a total of 66-0. CMUnited-98 builds on the successful CMUnited-97 implementation but also improves on it in many ways. This article gives an overview of the CMUnited-98 agent skills and multiagent coordination strategies, emphasizing the recent improvements.