
Information Condensing Active Learning Machine Learning

We introduce Information Condensing Active Learning (ICAL), a batch-mode, model-agnostic Active Learning (AL) method targeted at Deep Bayesian Active Learning that focuses on acquiring labels for points that carry as much information as possible about the still-unacquired points. ICAL uses the Hilbert-Schmidt Independence Criterion (HSIC) to measure the strength of the dependency between a candidate batch of points and the unlabeled set. We develop key optimizations that allow us to scale our method to large unlabeled sets. We show significant improvements in terms of model accuracy and negative log likelihood (NLL) on several image datasets compared to state-of-the-art batch-mode AL methods for deep learning.
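As a rough illustration of the dependency measure named in the abstract above, the following sketch computes the standard biased empirical HSIC estimate, trace(K H L H) / n^2, between two scalar samples using Gaussian kernels. It is not ICAL's acquisition procedure: the function names, the fixed bandwidth, and the toy data are assumptions made for this example, and the abstract does not say which kernels or estimator the authors use.

```python
import math
import random


def rbf_kernel_matrix(xs, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix for a list of scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in xs]
            for a in xs]


def center(K):
    """Double-center a symmetric kernel matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic_biased(xs, ys, bandwidth=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2 (hypothetical helper)."""
    n = len(xs)
    Kc = center(rbf_kernel_matrix(xs, bandwidth))
    Lc = center(rbf_kernel_matrix(ys, bandwidth))
    # For symmetric matrices, trace(Kc Lc) equals the elementwise product sum.
    return sum(Kc[i][j] * Lc[i][j] for i in range(n) for j in range(n)) / n ** 2


# Toy data (assumed): y_dep depends strongly on x, y_ind is independent of x.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y_dep = [xi + 0.1 * rng.gauss(0.0, 1.0) for xi in x]
y_ind = [rng.gauss(0.0, 1.0) for _ in range(200)]

hsic_dependent = hsic_biased(x, y_dep)
hsic_independent = hsic_biased(x, y_ind)
```

A larger value indicates stronger statistical dependence, so `hsic_dependent` should come out well above `hsic_independent` on this toy data.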

Preferential Batch Bayesian Optimization Machine Learning

Most research in Bayesian optimization (BO) has focused on direct feedback scenarios, where one has access to exact, or perturbed, values of some expensive-to-evaluate objective. This direction has been mainly driven by the use of BO in machine learning hyper-parameter configuration problems. However, in domains such as modelling human preferences, A/B tests or recommender systems, there is a need for methods that are able to replace direct feedback with preferential feedback, obtained via rankings or pairwise comparisons. In this work, we present Preferential Batch Bayesian Optimization (PBBO), a new framework that makes it possible to find the optimum of a latent function of interest, given any type of parallel preferential feedback for a group of two or more points. We do so by using a Gaussian process model with a likelihood specially designed to enable parallel and efficient data collection mechanisms, which are key in modern machine learning. We show how the acquisitions developed under this framework generalize and augment previous approaches in Bayesian optimization, expanding the use of these techniques to a wider range of domains. An extensive simulation study shows the benefits of this approach, both with simulated functions and four real data sets.

Distributionally Ambiguous Optimization Techniques in Batch Bayesian Optimization Machine Learning

We propose a novel, theoretically grounded acquisition function for batch Bayesian optimization informed by insights from distributionally ambiguous optimization. Our acquisition function is a lower bound on the well-known Expected Improvement function (which requires a multi-dimensional Gaussian expectation over a piecewise affine function) and is computed by evaluating instead the best-case expectation over all probability distributions consistent with the same mean and variance as the original Gaussian distribution. Unlike alternative approaches including Expected Improvement, our proposed acquisition function avoids multi-dimensional integrations entirely, and can be computed exactly as the solution of a convex optimization problem in the form of a tractable semidefinite program (SDP). Moreover, we prove that the solution of this SDP also yields exact numerical derivatives, which enable efficient optimization of the acquisition function. Finally, the method efficiently handles marginalized posteriors with respect to the Gaussian process hyperparameters. We demonstrate superior performance to heuristic alternatives and approximations of the intractable expected improvement, justifying this performance difference based on simple examples that break the assumptions of state-of-the-art methods.
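To make concrete the quantity that the abstract above says it bounds, the sketch below estimates batch Expected Improvement by plain Monte Carlo: the expectation, under a multivariate Gaussian, of the piecewise affine function max(0, best - min_i y_i). This is not the SDP-based method of the paper. It assumes a minimization convention, and the function names, sample count, and example mean and covariance are all illustrative.

```python
import math
import random


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def batch_ei_monte_carlo(mean, cov, best_value, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[max(0, best_value - min_i y_i)], y ~ N(mean, cov).

    Hypothetical helper; assumes the objective is being minimized.
    """
    rng = random.Random(seed)
    L = cholesky(cov)
    n = len(mean)
    total = 0.0
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlated sample: y = mean + L z.
        y = [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        total += max(0.0, best_value - min(y))
    return total / n_samples


# One candidate versus a correlated pair of candidates (assumed toy posterior).
ei_single = batch_ei_monte_carlo([0.0], [[1.0]], best_value=0.0)
ei_pair = batch_ei_monte_carlo([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], best_value=0.0)
```

For a single point with zero mean, unit variance, and `best_value = 0`, the closed-form Expected Improvement is the standard normal density at zero, about 0.399, which gives a sanity check on `ei_single`. The cost of sampling grows with batch size and dimension, which is the kind of integration the abstract says its bound avoids.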

Designing over uncertain outcomes with stochastic sampling Bayesian optimization Machine Learning

Optimization is becoming increasingly common in scientific and engineering domains. Oftentimes, these problems involve various levels of stochasticity or uncertainty in generating proposed solutions. Therefore, optimization in these scenarios must consider this stochasticity to properly guide the design of future experiments. Here, we adapt Bayesian optimization to handle uncertain outcomes, proposing a new framework called stochastic sampling Bayesian optimization (SSBO). We show that the bounds on expected regret for an upper confidence bound search in SSBO resemble those of earlier Bayesian optimization approaches, with added penalties due to the stochastic generation of inputs. Additionally, we adapt existing batch optimization techniques to properly limit the myopic decision making that can arise when selecting multiple instances before feedback. Finally, we show that SSBO techniques properly optimize a set of standard optimization problems as well as an applied problem inspired by bioengineering.

Sampling Acquisition Functions for Batch Bayesian Optimization Machine Learning

This paper presents Acquisition Thompson Sampling (ATS), a novel algorithm for batch Bayesian Optimization (BO) based on the idea of sampling multiple acquisition functions from a stochastic process. We define this process through the dependency of the acquisition functions on a set of model parameters. ATS is conceptually simple, straightforward to implement and, unlike other batch BO methods, it can be employed to parallelize any sequential acquisition function. In order to improve performance for multi-modal tasks, we show that ATS can be combined with existing techniques in order to realize different explore-exploit trade-offs and take into account pending function evaluations. We present experiments on a variety of benchmark functions and on the hyper-parameter optimization of a popular gradient boosting tree algorithm. These demonstrate the competitiveness of our algorithm with two state-of-the-art batch BO methods, and its advantages over classical parallel Thompson Sampling for BO.
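For context, the classical parallel Thompson Sampling baseline mentioned at the end of the abstract above can be sketched as follows: each slot in the batch draws one sample from the posterior over the objective and picks the candidate that maximizes that sample. This is the baseline, not ATS itself. The sketch assumes a finite candidate set with independent Gaussian posteriors per candidate (a real Gaussian process posterior would be sampled jointly), and the function and parameter names are hypothetical.

```python
import random


def parallel_thompson_batch(means, stds, batch_size, seed=0):
    """Pick a batch of candidate indices by classical parallel Thompson Sampling.

    Assumes independent Gaussian posteriors N(means[i], stds[i]^2) per candidate
    and a maximization convention. Duplicates in the batch are allowed.
    """
    rng = random.Random(seed)
    batch = []
    for _ in range(batch_size):
        # One posterior draw per candidate, then act greedily on that draw.
        draw = [rng.gauss(m, s) for m, s in zip(means, stds)]
        batch.append(max(range(len(draw)), key=draw.__getitem__))
    return batch


# Assumed toy posterior over three candidates.
batch = parallel_thompson_batch([0.1, 0.9, 0.3], [0.5, 0.5, 0.5], batch_size=4)
```

Diversity within the batch comes only from the randomness of the posterior draws, so candidates with overlapping posteriors are explored while clearly dominated ones are rarely chosen.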