Collaborating Authors

 Schneider, Jeff


Asynchronous Multi Agent Active Search

arXiv.org Machine Learning

Active search refers to the problem of efficiently locating targets in an unknown environment by actively making data-collection decisions. It has many applications, including detecting gas leaks, radiation sources, or human survivors of disasters using aerial and/or ground robots (agents). Existing active search methods are in general only amenable to a single agent or, where they extend to multiple agents, require a central control system to coordinate the actions of all agents. However, such control systems are often impractical in robotics applications. In this paper, we propose two distinct active search algorithms called SPATS (Sparse Parallel Asynchronous Thompson Sampling) and LATSI (LAplace Thompson Sampling with Information gain) that allow multiple agents to independently make data-collection decisions without a central coordinator. Throughout, we assume that targets are sparsely located around the environment, in keeping with compressive sensing assumptions and their applicability in real-world scenarios. Additionally, while most common search algorithms assume that agents can sense the entire environment (e.g. compressive sensing) or sense point-wise (e.g. Bayesian Optimization) at all times, we make the realistic assumption that each agent can only sense a contiguous region of space at a time. We provide simulation results as well as theoretical analysis to demonstrate the efficacy of our proposed algorithms.
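As a rough illustration of how Thompson sampling lets an agent decide without a coordinator, the sketch below uses a plain Beta-Bernoulli model over a handful of candidate locations. It is not SPATS or LATSI: the sparse and Laplace models, the region sensing, and the information-gain term from the abstract are all omitted, and the names `thompson_choose`, `successes`, and `failures` are hypothetical.

```python
import random


def thompson_choose(successes, failures, rng):
    """Pick one location by Thompson sampling.

    Assumption: each location has an independent Beta(successes + 1,
    failures + 1) posterior over its detection probability. This is a
    generic textbook model, not the one used in the paper.
    """
    # Draw one posterior sample per location.
    samples = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    # Act greedily with respect to the sampled values.
    return max(range(len(samples)), key=samples.__getitem__)


# Each agent calls thompson_choose whenever it becomes free, using whatever
# observations have been shared so far. No agent waits for the others.
shared_successes = [3, 0, 1]
shared_failures = [1, 4, 1]
agent_rng = random.Random(0)
next_location = thompson_choose(shared_successes, shared_failures, agent_rng)
```

Because every agent draws its own posterior sample, agents with identical information still tend to choose different locations, which is what makes the scheme workable without central coordination.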


Neural Dynamical Systems: Balancing Structure and Flexibility in Physical Prediction

arXiv.org Machine Learning

We introduce Neural Dynamical Systems (NDS), a method of learning dynamical models in various gray-box settings which incorporates prior knowledge in the form of systems of ordinary differential equations. NDS uses neural networks to estimate free parameters of the system and to predict residual terms, and then numerically integrates over time to predict future states. A key insight is that many real dynamic systems of interest are hard to model because the dynamics may vary across rollouts. We mitigate this problem by taking a trajectory of prior states as the input to NDS and training it to re-estimate system parameters using the preceding trajectory. We find that NDS learns dynamics with higher accuracy and fewer samples than a variety of deep learning methods that do not incorporate the prior knowledge, as well as methods from the system identification literature which do. We demonstrate these advantages first on synthetic dynamical systems and then on real data captured from deuterium shots from a nuclear fusion reactor.


Multi-fidelity Gaussian Process Bandit Optimisation

Journal of Artificial Intelligence Research

In many scientific and engineering applications, we are tasked with the maximisation of an expensive to evaluate black box function f. Traditional settings for this problem assume just the availability of this single function. However, in many cases, cheap approximations to f may be obtainable. For example, the expensive real world behaviour of a robot can be approximated by a cheap computer simulation. We can use these approximations to eliminate low function value regions cheaply and use the expensive evaluations of f in a small but promising region and speedily identify the optimum. We formalise this task as a multi-fidelity bandit problem where the target function and its approximations are sampled from a Gaussian process. We develop MF-GP-UCB, a novel method based on upper confidence bound techniques. In our theoretical analysis we demonstrate that it exhibits precisely the above behaviour and achieves better bounds on the regret than strategies which ignore multi-fidelity information. Empirically, MF-GP-UCB outperforms such naive strategies and other multi-fidelity methods on several synthetic and real experiments.
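To make the phrase "upper confidence bound techniques" concrete, here is a minimal sketch of the generic single-fidelity UCB selection rule. It is not MF-GP-UCB itself: the multi-fidelity logic is omitted, the posterior means and standard deviations are assumed to come from some already fitted Gaussian process, and `ucb_select` and `beta` are hypothetical names chosen for illustration.

```python
import math


def ucb_select(means, stds, beta=4.0):
    """Return the index of the candidate with the highest upper confidence bound.

    means, stds: posterior mean and standard deviation at each candidate
    point (assumed to be supplied by a fitted GP, which is not shown here).
    beta: exploration weight; larger values favour uncertain candidates.
    """
    scores = [m + math.sqrt(beta) * s for m, s in zip(means, stds)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Candidate 1 has the lowest mean but the highest uncertainty.
best, scores = ucb_select([0.5, 0.2, 0.4], [0.05, 0.3, 0.1], beta=4.0)
```

With `beta=4.0` the uncertain candidate at index 1 is selected, whereas with `beta=0.0` the rule collapses to picking the highest posterior mean (index 0).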


ChemBO: Bayesian Optimization of Small Organic Molecules with Synthesizable Recommendations

arXiv.org Machine Learning

We describe ChemBO, a Bayesian Optimization framework for generating and optimizing organic molecules for desired molecular properties. This framework is useful in applications such as drug discovery, where an algorithm recommends new candidate molecules; these molecules first need to be synthesized and then tested for drug-like properties. The algorithm uses the results of past tests to recommend new ones so as to find good molecules efficiently. Most existing data-driven methods for this problem do not account for sample efficiency and/or fail to enforce realistic constraints on synthesizability. In this work, we explore existing kernels for molecules in the literature and propose a novel kernel which views a molecule as a graph. In ChemBO, we implement these kernels in a Gaussian process model. Then we explore the chemical space by traversing possible paths of molecular synthesis. Consequently, our approach provides a proposed synthesis path every time it recommends a new molecule to test, a crucial advantage when compared to existing methods. In our experiments, we demonstrate the efficacy of the proposed approach on several molecular optimization problems.


Deep Kinematic Models for Physically Realistic Prediction of Vehicle Trajectories

arXiv.org Machine Learning

While the trajectory without the vehicle model appears reasonable, it is physically impossible for a two-axle vehicle to execute its motion in such a manner because its rear wheels cannot turn. The proposed approach outputs a trajectory that is kinematically feasible and correctly predicts that the actor will encroach into the neighboring lane. We summarize the main contributions of our work below:

- We combine powerful deep methods with a kinematic two-axle vehicle motion model in order to output trajectory predictions with guaranteed physical realism;
- While the idea is general and applicable to any deep architecture, we present an example application to a recently proposed state-of-the-art motion prediction method, using rasterized images of vehicle context as input to convolutional neural networks (CNNs) [7];
- We evaluate the method on a large-scale, real-world data set collected by a fleet of SDVs, showing that the system provides accurate, kinematically feasible predictions that outperform the existing state-of-the-art.

2 Related work

2.1 Motion prediction in autonomous driving

Accurate motion prediction of other vehicles is a critical component in many autonomous driving systems [9, 10, 11]. Prediction provides an estimate of future world state, which can be used to plan an optimal path for the SDV through a dynamic traffic environment. The current state (e.g., position, speed, acceleration) of vehicles around an SDV can be estimated using techniques such as a Kalman filter (KF) [12, 13]. A common approach for short time horizon predictions of future motion is to assume that the driver will not change any control inputs (steering, accelerator) and simply propagate the vehicle's current estimated state over time using a physical model (e.g., a vehicle motion model) that captures the underlying kinematics [9]. For longer time horizons the performance of this approach degrades, as the underlying assumption of constant controls becomes increasingly unlikely.
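The constant-controls baseline described above can be sketched as a short rollout. The example assumes a kinematic bicycle model, a common simplification of a two-axle vehicle in which only the front wheels steer; it is not the paper's deep kinematic model, and the function name, wheelbase, time step, and horizon are hypothetical values chosen for illustration.

```python
import math


def propagate_constant_controls(x, y, heading, speed, steer, accel,
                                wheelbase=2.8, dt=0.1, steps=30):
    """Roll a kinematic bicycle model forward with steering and acceleration held fixed.

    Assumptions: rear-axle reference point, forward Euler integration,
    angles in radians, distances in metres, time in seconds.
    Returns a list of (x, y, heading, speed) tuples, one per time step.
    """
    trajectory = []
    for _ in range(steps):
        # Position advances along the current heading.
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        # Heading changes only through the steered front wheels.
        heading += speed / wheelbase * math.tan(steer) * dt
        # Speed follows the fixed acceleration and never goes negative.
        speed = max(0.0, speed + accel * dt)
        trajectory.append((x, y, heading, speed))
    return trajectory


# Three-second predictions at 10 m/s: driving straight, and with a slight left steer.
straight = propagate_constant_controls(0.0, 0.0, 0.0, 10.0, steer=0.0, accel=0.0)
turning = propagate_constant_controls(0.0, 0.0, 0.0, 10.0, steer=0.1, accel=0.0)
```

Because the heading can only change through the steering term, every trajectory produced this way is kinematically feasible by construction. Its accuracy still depends on the controls actually staying fixed, which is the weakness at longer horizons noted above.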


Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly

arXiv.org Artificial Intelligence

Bayesian Optimisation (BO) refers to a suite of techniques for global optimisation of expensive black box functions, which use introspective Bayesian models of the function to efficiently find the optimum. While BO has been applied successfully in many applications, modern optimisation tasks usher in new challenges where conventional methods fail spectacularly. In this work, we present Dragonfly, an open source Python library for scalable and robust BO. Dragonfly incorporates multiple recently developed methods that allow BO to be applied in challenging real world settings; these include better methods for handling higher dimensional domains, methods for handling multi-fidelity evaluations when cheap approximations of an expensive function are available, methods for optimising over structured combinatorial spaces, such as the space of neural network architectures, and methods for handling parallel evaluations. Additionally, we develop new methodological improvements in BO for selecting the Bayesian model, selecting the acquisition function, and optimising over complex domains with different variable types and additional constraints. We compare Dragonfly to a suite of other packages and algorithms for global optimisation and demonstrate that when the above methods are integrated, they enable significant improvements in the performance of BO. The Dragonfly library is available at dragonfly.github.io.


ProBO: a Framework for Using Probabilistic Programming in Bayesian Optimization

arXiv.org Machine Learning

Optimizing an expensive-to-query function is a common task in science and engineering, where it is beneficial to keep the number of queries to a minimum. A popular strategy is Bayesian optimization (BO), which leverages probabilistic models for this task. Most BO today uses Gaussian processes (GPs), or a few other surrogate models. However, there is a broad set of Bayesian modeling techniques that we may want to use to capture complex systems and reduce the number of queries. Probabilistic programs (PPs) are modern tools that allow for flexible model composition, incorporation of prior information, and automatic inference. In this paper, we develop ProBO, a framework for BO using only standard operations common to most PPs. This allows a user to drop in an arbitrary PP implementation and use it directly in BO. To do this, we describe black box versions of popular acquisition functions that can be used in our framework automatically, without model-specific derivation, and show how to optimize these functions. We also introduce a model, which we term the Bayesian Product of Experts, that integrates into ProBO and can be used to combine information from multiple models implemented with different PPs. We show empirical results using multiple PP implementations, and compare against standard BO methods.


Neural Architecture Search with Bayesian Optimisation and Optimal Transport

Neural Information Processing Systems

Bayesian Optimisation (BO) refers to a class of methods for global optimisation of a function f which is only accessible via point evaluations. It is typically used in settings where f is expensive to evaluate. A common use case for BO in machine learning is model selection, where it is not possible to analytically model the generalisation performance of a statistical model, and we resort to noisy and expensive training and validation procedures to choose the best model. Conventional BO methods have focused on Euclidean and categorical domains, which, in the context of model selection, only permits tuning scalar hyper-parameters of machine learning algorithms. However, with the surge of interest in deep learning, there is an increasing demand to tune neural network architectures. In this work, we develop NASBOT, a Gaussian process based BO framework for neural architecture search. To accomplish this, we develop a distance metric in the space of neural network architectures which can be computed efficiently via an optimal transport program. This distance might be of independent interest to the deep learning community as it may find applications outside of BO. We demonstrate that NASBOT outperforms other alternatives for architecture search in several cross validation based model selection tasks on multi-layer perceptrons and convolutional neural networks.


Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks

arXiv.org Machine Learning

Autonomous driving presents one of the largest problems that the robotics and artificial intelligence communities are facing at the moment, both in terms of difficulty and potential societal impact. Self-driving vehicles (SDVs) are expected to prevent road accidents and save millions of lives while improving the livelihood and life quality of many more. However, despite large interest and a number of industry players working in the autonomous domain, there is still more to be done in order to develop a system capable of operating at a level comparable to the best human drivers. One reason for this is the high uncertainty of traffic behavior and the large number of situations that an SDV may encounter on the roads, making it very difficult to create a fully generalizable system. To ensure safe and efficient operations, an autonomous vehicle is required to account for this uncertainty and to anticipate a multitude of possible behaviors of traffic actors in its surroundings. In this work, we address this critical problem and present a method to predict multiple possible trajectories of actors while also estimating their probabilities. The method encodes each actor's surrounding context into a raster image, used as input by deep convolutional networks to automatically derive relevant features for the task. Following extensive offline evaluation and comparison to state-of-the-art baselines, as well as closed course tests, the method was successfully deployed to a fleet of SDVs.