Wagstaff, Ed, Hamid, Saad, Osborne, Michael

Integration over non-negative integrands is a central problem in machine learning (e.g. for model averaging, (hyper-)parameter marginalisation, and computing posterior predictive distributions). Bayesian Quadrature is a probabilistic numerical integration technique that performs promisingly when compared to traditional Markov Chain Monte Carlo methods. However, in contrast to easily-parallelised MCMC methods, Bayesian Quadrature methods have, thus far, been essentially serial in nature, selecting a single point to sample at each step of the algorithm. We deliver methods to select batches of points at each step, based upon those recently presented in the Batch Bayesian Optimisation literature. Such parallelisation significantly reduces computation time, especially when the integrand is expensive to sample.

Tiao, Louis C., Klein, Aaron, Archambeau, Cedric, Seeger, Matthias

We introduce a model-based asynchronous multi-fidelity hyperparameter optimization (HPO) method, combining strengths of asynchronous Hyperband and Gaussian process-based Bayesian optimization. Our method obtains substantial speed-ups in wall-clock time over, both, synchronous and asynchronous Hyperband, as well as a prior model-based extension of the former. Candidate hyperparameters to evaluate are selected by a novel jointly dependent Gaussian process-based surrogate model over all resource levels, allowing evaluations at one level to be informed by evaluations gathered at all others. We benchmark several covariance functions and conduct extensive experiments on hyperparameter tuning for multi-layer perceptrons on tabular data, convolutional networks on image classification, and recurrent networks on language modelling, demonstrating the benefits of our approach.

In recent years, leveraging parallel and distributed computational resources has become essential to solve problems of high computational cost. Bayesian optimization (BO) has shown attractive results in those expensive-to-evaluate problems such as hyperparameter optimization of machine learning algorithms. While many parallel BO methods have been developed to search efficiently utilizing these computational resources, these methods assumed synchronous settings or were not scalable. In this paper, we propose a simple and scalable BO method for asynchronous parallel settings. Experiments are carried out with a benchmark function and hyperparameter optimization of multi-layer perceptrons, which demonstrate the promising performance of the proposed method.

De Ath, George, Everson, Richard M., Fieldsend, Jonathan E., Rahat, Alma A. M.

Bayesian optimisation is a popular, surrogate model-based approach for optimising expensive black-box functions. Given a surrogate model, the next location to expensively evaluate is chosen via maximisation of a cheap-to-query acquisition function. We present an $\epsilon$-greedy procedure for Bayesian optimisation in batch settings in which the black-box function can be evaluated multiple times in parallel. Our $\epsilon$-shotgun algorithm leverages the model's prediction, uncertainty, and the approximated rate of change of the landscape to determine the spread of batch solutions to be distributed around a putative location. The initial target location is selected either in an exploitative fashion on the mean prediction, or -- with probability $\epsilon$ -- from elsewhere in the design space. This results in locations that are more densely sampled in regions where the function is changing rapidly and in locations predicted to be good (i.e close to predicted optima), with more scattered samples in regions where the function is flatter and/or of poorer quality. We empirically evaluate the $\epsilon$-shotgun methods on a range of synthetic functions and two real-world problems, finding that they perform at least as well as state-of-the-art batch methods and in many cases exceed their performance.

Tran, Anh, McCann, Scott, Furlan, John M., Pagalthivarthi, Krishnan V., Visintainer, Robert J., Wildey, Tim

High-fidelity complex engineering simulations are highly predictive, but also computationally expensive and often require substantial computational efforts. The mitigation of computational burden is usually enabled through parallelism in high-performance cluster (HPC) architecture. In this paper, an asynchronous constrained batch-parallel Bayesian optimization method is proposed to efficiently solve the computationally-expensive simulation-based optimization problems on the HPC platform, with a budgeted computational resource, where the maximum number of simulations is a constant. The advantages of this method are three-fold. First, the efficiency of the Bayesian optimization is improved, where multiple input locations are evaluated massively parallel in an asynchronous manner to accelerate the optimization convergence with respect to physical runtime. This efficiency feature is further improved so that when each of the inputs is finished, another input is queried without waiting for the whole batch to complete. Second, the method can handle both known and unknown constraints. Third, the proposed method considers several acquisition functions at the same time and sample based on an evolving probability mass distribution function using GP-Hedge scheme, where parameters are corresponding to the performance of each acquisition function. The proposed framework is termed aphBO-2GP-3B, which corresponds to asynchronous parallel hedge Bayesian optimization with two Gaussian processes and three batches. The aphBO-2GP-3B framework is demonstrated using two high-fidelity expensive industrial applications, where the first one is based on finite element analysis (FEA) and the second one is based on computational fluid dynamics (CFD) simulations.