Picheny, Victor

Bayesian optimization under mixed constraints with a slack-variable augmented Lagrangian

Neural Information Processing Systems

An augmented Lagrangian (AL) can convert a constrained optimization problem into a sequence of simpler (e.g., unconstrained) problems which are then usually solved with local solvers. Recently, surrogate-based Bayesian optimization (BO) sub-solvers have been successfully deployed in the AL framework for a more global search in the presence of inequality constraints; however a drawback was that expected improvement (EI) evaluations relied on Monte Carlo. Here we introduce an alternative slack variable AL, and show that in this formulation the EI may be evaluated with library routines. The slack variables furthermore facilitate equality as well as inequality constraints, and mixtures thereof. We show our new slack "ALBO" compares favorably to the original.

Bayesian Quantile and Expectile Optimisation

arXiv.org Machine Learning

Bayesian optimisation is widely used to optimise stochastic black box functions. While most strategies are focused on optimising conditional expectations, a large variety of applications require risk-averse decisions and alternative criteria accounting for the distribution tails need to be considered. In this paper, we propose new variational models for Bayesian quantile and expectile regression that are well-suited for heteroscedastic settings. Our models consist of two latent Gaussian processes accounting respectively for the conditional quantile (or expectile) and variance that are chained through asymmetric likelihood functions. Furthermore, we propose two Bayesian optimisation strategies, either derived from a GP-UCB or Thompson sampling, that are tailored to such models and that can accommodate large batches of points. As illustrated in the experimental section, the proposed approach clearly outperforms the state of the art.

Ordinal Bayesian Optimisation

arXiv.org Machine Learning

In BO, nonparametric Gaussian processes (GPs) provide flexible and fast-to-evaluate surrogates of the objective functions. Sequential design decisions, so-called acquisitions, judiciously balance exploration and exploitation in search for global optima, leveraging the uncertainty estimates provided by the GP posterior distributions (see Mockus et al. (1978); Jones et al. (1998) for early works or Shahriari et al. (2015) for a recent review). One of the weaknesses of vanilla BO lies in the underlying assumption that the objective function is a realisation of a GP: when this assumption is strongly violated, the GP model is weakly predictive and BO becomes inefficient. Two classical examples where BO fails are ill-conditioned problems, when the objective function has strong variations on the domain boundaries but is very flat in its central region (or conversely), and non-Lipschitz objectives, for instance with local discontinuities. High conditioning is typical in "exploratory" optimisation problems, when the parameter space is initially chosen very large. Discontinuities are frequent in computational fluid dynamics problems for instance, where a small change in the parameters results in a change of physics (e.g.

Modeling and Optimization with Gaussian Processes in Reduced Eigenbases -- Extended Version

arXiv.org Machine Learning

Parametric shape optimization aims at minimizing an objective function f(x) where x are CAD parameters. This task is difficult when f is the output of an expensive-to-evaluate numerical simulator and the number of CAD parameters is large. Most often, the set of all considered CAD shapes resides in a manifold of lower effective dimension in which it is preferable to build the surrogate model and perform the optimization. In this work, we uncover the manifold through a high-dimensional shape mapping and build a new coordinate system made of eigenshapes. The surrogate model is learned in the space of eigenshapes: a regularized likelihood maximization provides the most relevant dimensions for the output. The final surrogate model is detailed (anisotropic) with respect to the most sensitive eigenshapes and rough (isotropic) in the remaining dimensions. Last, the optimization is carried out with a focus on the critical dimensions, the remaining ones being coarsely optimized through a random embedding and the manifold being accounted for through a replication strategy. At low budgets, the methodology leads to a more accurate model and a faster optimization than the classical approach of directly working with the CAD parameters.

X-Armed Bandits: Optimizing Quantiles and Other Risks

arXiv.org Machine Learning

We propose and analyze StoROO, an algorithm for risk optimization on stochastic black-box functions derived from StoOO. Motivated by risk-averse decision making fields like agriculture, medicine, biology or finance, we do not focus on the mean payoff but on generic functionals of the return distribution, like for example quantiles. We provide a generic regret analysis of StoROO. Inspired by the bandit literature and black-box mean optimizers, StoROO relies on the possibility to construct confidence intervals for the targeted functional based on random-size samples. We explain in detail how to construct them for quantiles, providing tight bounds based on Kullback-Leibler divergence. The interest of these tight bounds is highlighted by numerical experiments that show a dramatic improvement over standard approaches.

The Kalai-Smorodinski solution for many-objective Bayesian optimization

arXiv.org Machine Learning

An ongoing aim of research in multiobjective Bayesian optimization is to extend its applicability to a large number of objectives. While coping with a limited budget of evaluations, recovering the set of optimal compromise solutions generally requires numerous observations and is less interpretable since this set tends to grow larger with the number of objectives. We thus propose to focus on a specific solution originating from game theory, the Kalai-Smorodinsky solution, which possesses attractive properties. In particular, it ensures equal marginal gains over all objectives. We further make it insensitive to a monotonic transformation of the objectives by considering the objectives in the copula space. A novel tailored algorithm is proposed to search for the solution, in the form of a Bayesian optimization algorithm: sequential sampling decisions are made based on acquisition functions that derive from an instrumental Gaussian process prior. Our approach is tested on three problems with respectively four, six, and ten objectives. The method is available in the package GPGame available on CRAN at https://cran.r-project.org/package=GPGame.

A Review on Quantile Regression for Stochastic Computer Experiments

arXiv.org Machine Learning

We report on an empirical study of the main strategies for conditional quantile estimation in the context of stochastic computer experiments. To ensure adequate diversity, six metamodels are presented, divided into three categories based on order statistics, functional approaches, and those of Bayesian inspiration. The metamodels are tested on several problems characterized by the size of the training set, the input dimension, the quantile order and the value of the probability density function in the neighborhood of the quantile. The metamodels studied reveal good contrasts in our set of 480 experiments, enabling several patterns to be extracted. Based on our results, guidelines are proposed to allow users to select the best method for a given problem.

Targeting Solutions in Bayesian Multi-Objective Optimization: Sequential and Parallel Versions

arXiv.org Machine Learning

Multi-objective optimization aims at finding trade-off solutions to conflicting objectives. These constitute the Pareto optimal set. In the context of expensive-to-evaluate functions, it is impossible and often non-informative to look for the entire set. As an end-user would typically prefer a certain part of the objective space, we modify the Bayesian multi-objective optimization algorithm which uses Gaussian Processes to maximize the Expected Hypervolume Improvement, to focus the search in the preferred region. The cumulated effects of the Gaussian Processes and the targeting strategy lead to a particularly efficient convergence to the desired part of the Pareto set. To take advantage of parallel computing, a multi-point extension of the targeting criterion is proposed and analyzed.

Budgeted Multi-Objective Optimization with a Focus on the Central Part of the Pareto Front - Extended Version

arXiv.org Machine Learning

Optimizing nonlinear systems involving expensive (computer) experiments with regard to conflicting objectives is a common challenge. When the number of experiments is severely restricted and/or when the number of objectives increases, uncovering the whole set of optimal solutions (the Pareto front) is out of reach, even for surrogate-based approaches. As non-compromising Pareto optimal solutions have usually little point in applications, this work restricts the search to relevant solutions that are close to the Pareto front center. The article starts by characterizing this center. Next, a Bayesian multi-objective optimization method for directing the search towards it is proposed. A criterion for detecting convergence to the center is described. If the criterion is triggered, a widened central part of the Pareto front is targeted such that sufficiently accurate convergence to it is forecasted within the remaining budget. Numerical experiments show how the resulting algorithm, C-EHI, better locates the central part of the Pareto front when compared to state-of-the-art Bayesian algorithms.

A Bayesian optimization approach to find Nash equilibria

arXiv.org Machine Learning

Game theory finds nowadays a broad range of applications in engineering and machine learning. However, in a derivative-free, expensive black-box context, very few algorithmic solutions are available to find game equilibria. Here, we propose a novel Gaussian-process based approach for solving games in this context. We follow a classical Bayesian optimization framework, with sequential sampling decisions based on acquisition functions. Two strategies are proposed, based either on the probability of achieving equilibrium or on the Stepwise Uncertainty Reduction paradigm. Practical and numerical aspects are discussed in order to enhance the scalability and reduce computation time. Our approach is evaluated on several synthetic game problems with varying number of players and decision space dimensions. We show that equilibria can be found reliably for a fraction of the cost (in terms of black-box evaluations) compared to classical, derivative-based algorithms. The method is available in the R package GPGame available on CRAN at https://cran.r-project.org/package=GPGame.