Collaborating Authors

Automating Bayesian optimization with Bayesian optimization

Neural Information Processing Systems

Bayesian optimization is a powerful tool for global optimization of expensive functions. One of its key components is the underlying probabilistic model used for the objective function f. In practice, however, it is often unclear how one should appropriately choose a model, especially when gathering data is expensive. In this work, we introduce a novel automated Bayesian optimization approach that dynamically selects promising models for explaining the observed data using Bayesian Optimization in the model space. Crucially, we account for the uncertainty in the choice of model; our method is capable of using multiple models to represent its current belief about f and subsequently using this information for decision making.

Bayesian Optimization with Exponential Convergence

Neural Information Processing Systems

This paper presents a Bayesian optimization method with exponential convergence without the need of auxiliary optimization and without the delta-cover sampling. Most Bayesian optimization methods require auxiliary optimization: an additional non-convex global optimization problem, which can be time-consuming and hard to implement in practice. Also, the existing Bayesian optimization method with exponential convergence requires access to the delta-cover sampling, which was considered to be impractical. Our approach eliminates both requirements and achieves an exponential convergence rate. Papers published at the Neural Information Processing Systems Conference.

Lookahead Bayesian Optimization with Inequality Constraints

Neural Information Processing Systems

We consider the task of optimizing an objective function subject to inequality constraints when both the objective and the constraints are expensive to evaluate. Bayesian optimization (BO) is a popular way to tackle optimization problems with expensive objective function evaluations, but has mostly been applied to unconstrained problems. Several BO approaches have been proposed to address expensive constraints but are limited to greedy strategies maximizing immediate reward. To address this limitation, we propose a lookahead approach that selects the next evaluation in order to maximize the long-term feasible reduction of the objective function. We present numerical experiments demonstrating the performance improvements of such a lookahead approach compared to several greedy BO algorithms, including constrained expected improvement (EIC) and predictive entropy search with constraint (PESC).

Bayesian Optimization in a Billion Dimensions via Random Embeddings

Journal of Artificial Intelligence Research

Bayesian optimization techniques have been successfully applied to robotics, planning, sensor placement, recommendation, advertising, intelligent user interfaces and automatic algorithm configuration. Despite these successes, the approach is restricted to problems of moderate dimension, and several workshops on Bayesian optimization have identified its scaling to high-dimensions as one of the holy grails of the field. In this paper, we introduce a novel random embedding idea to attack this problem. The resulting Random EMbedding Bayesian Optimization (REMBO) algorithm is very simple, has important invariance properties, and applies to domains with both categorical and continuous variables. We present a thorough theoretical analysis of REMBO. Empirical results confirm that REMBO can effectively solve problems with billions of dimensions, provided the intrinsic dimensionality is low. They also show that REMBO achieves state-of-the-art performance in optimizing the 47 discrete parameters of a popular mixed integer linear programming solver.

Bayesian Optimization with Gradients

Neural Information Processing Systems

Bayesian optimization has shown success in global optimization of expensive-to-evaluate multimodal objective functions. However, unlike most optimization methods, Bayesian optimization typically does not use derivative information. In this paper we show how Bayesian optimization can exploit derivative information to find good solutions with fewer objective function evaluations. In particular, we develop a novel Bayesian optimization algorithm, the derivative-enabled knowledge-gradient (dKG), which is one-step Bayes-optimal, asymptotically consistent, and provides greater one-step value of information than in the derivative-free setting. We also compute the dKG acquisition function and its gradient using a novel fast discretization-free technique.