Adaptive Design of Experiments for Conservative Estimation of Excursion Sets Machine Learning

We consider a Gaussian process model trained on few evaluations of an expensive-to-evaluate deterministic function and we study the problem of estimating a fixed excursion set of this function. We review the concept of conservative estimates, recently introduced in this framework, and, in particular, we focus on estimates based on Vorob'ev quantiles. We present a method that sequentially selects new evaluations of the function in order to reduce the uncertainty on such estimates. The sequential strategies are first benchmarked on artificial test cases generated from Gaussian process realizations in two and five dimensions, and then applied to two reliability engineering test cases.
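
To illustrate the Vorob'ev quantiles mentioned above, here is a minimal sketch rather than the authors' implementation. It assumes the Gaussian process posterior is summarized by a marginal mean and standard deviation at each point of a finite grid; the function names and the grid representation are hypothetical. It computes the coverage probability p(x) = P(f(x) > t), the quantile Q_rho = {x : p(x) >= rho}, and the level at which the quantile volume matches the expected excursion volume (the Vorob'ev expectation).

```python
import math


def coverage_probability(mean, std, threshold):
    """P(f(x) > threshold) when f(x) has Gaussian marginal N(mean, std^2)."""
    if std == 0.0:
        return 1.0 if mean > threshold else 0.0
    z = (mean - threshold) / std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def vorobev_quantile(probs, rho):
    """Indices of grid points whose coverage probability is at least rho."""
    return [i for i, p in enumerate(probs) if p >= rho]


def vorobev_expectation_level(probs, tol=1e-6):
    """Bisect for the largest level whose quantile volume is still at least
    the expected excursion volume (the grid average of the probabilities)."""
    n = len(probs)
    expected_volume = sum(probs) / n
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if len(vorobev_quantile(probs, mid)) / n >= expected_volume:
            lo = mid  # quantile is still large enough, so raise the level
        else:
            hi = mid
    return lo


# Toy one-dimensional grid (assumed values, for illustration only).
means = [-2.0, -1.0, 0.0, 1.0, 2.0]
stds = [1.0] * 5
probs = [coverage_probability(m, s, 0.0) for m, s in zip(means, stds)]
level = vorobev_expectation_level(probs)
high_confidence_set = vorobev_quantile(probs, 0.95)
```

Raising rho shrinks the quantile, which makes it more likely to be contained in the true excursion set. Choosing the level so that this containment holds with a prescribed probability requires the joint distribution of the process, which this sketch does not compute.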

Gradient descent in Gaussian random fields as a toy model for high-dimensional optimisation in deep learning Machine Learning

In this paper we model the loss function of high-dimensional optimization problems by a Gaussian random field, or equivalently a Gaussian process. Our aim is to study gradient descent in such loss functions or energy landscapes and compare it to results obtained from real high-dimensional optimization problems such as those encountered in deep learning. In particular, we analyze the distribution of the improved loss function after a step of gradient descent, provide analytic expressions for the moments, and prove asymptotic normality as the dimension of the parameter space becomes large. Moreover, we compare this with the expectation of the global minimum of the landscape obtained by means of the Euler characteristic of excursion sets. Besides complementing our analytical findings with numerical results from simulated Gaussian random fields, we also compare them to loss functions obtained from optimization problems on synthetic and real data sets by proposing a "black box" random field toy model for a deep neural network loss function.
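
The setting above can be sketched as follows. This is an illustrative sketch, not the paper's construction. It assumes a stationary field with a squared-exponential covariance, approximated by random Fourier features; the names `sample_random_field` and `gradient_step`, the number of features, and the learning rate are all hypothetical choices. It draws one approximate realization of the field, exposes its loss and gradient, and takes a single gradient descent step.

```python
import math
import random


def sample_random_field(dim, n_features=200, length_scale=1.0, seed=0):
    """Return (loss, grad) for one approximate Gaussian random field sample.

    The field is sqrt(2/m) * sum_k a_k * cos(w_k . x + b_k), with Gaussian
    frequencies w_k, uniform phases b_k, and standard normal amplitudes a_k.
    """
    rng = random.Random(seed)
    freqs = [[rng.gauss(0.0, 1.0 / length_scale) for _ in range(dim)]
             for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    amps = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phase_at(w, b, x):
        return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b

    def loss(x):
        return scale * sum(a * math.cos(phase_at(w, b, x))
                           for a, w, b in zip(amps, freqs, phases))

    def grad(x):
        g = [0.0] * dim
        for a, w, b in zip(amps, freqs, phases):
            s = -scale * a * math.sin(phase_at(w, b, x))
            for i in range(dim):
                g[i] += s * w[i]
        return g

    return loss, grad


def gradient_step(x, grad, learning_rate):
    """One plain gradient descent update."""
    g = grad(x)
    return [x_i - learning_rate * g_i for x_i, g_i in zip(x, g)]


loss, grad = sample_random_field(dim=5, seed=1)
x0 = [0.0] * 5
x1 = gradient_step(x0, grad, learning_rate=0.001)
improvement = loss(x0) - loss(x1)  # loss decrease after one step
```

Repeating this over many seeds gives an empirical distribution of `improvement`, which is the kind of quantity the abstract describes analytically.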

A new integral loss function for Bayesian optimization Machine Learning

We consider the problem of maximizing a real-valued continuous function $f$ using a Bayesian approach. Since the early work of Jonas Mockus and Antanas \v{Z}ilinskas in the 1970s, the problem of optimization is usually formulated by considering the loss function $\max f - M_n$ (where $M_n$ denotes the best function value observed after $n$ evaluations of $f$). This loss function puts emphasis on the value of the maximum, at the expense of the location of the maximizer. In the special case of a one-step Bayes-optimal strategy, it leads to the classical Expected Improvement (EI) sampling criterion. This is a special case of a Stepwise Uncertainty Reduction (SUR) strategy, where the risk associated with a certain uncertainty measure (here, the expected loss) on the quantity of interest is minimized at each step of the algorithm. In this article, assuming that $f$ is defined over a measure space $(\mathbb{X}, \lambda)$, we propose to consider instead the integral loss function $\int_{\mathbb{X}} (f - M_n)_{+}\, d\lambda$, and we show that this leads, in the case of a Gaussian process prior, to a new numerically tractable sampling criterion that we call $\rm EI^2$ (for Expected Integrated Expected Improvement). A numerical experiment illustrates that a SUR strategy based on this new sampling criterion reduces the error on both the value and the location of the maximizer faster than the EI-based strategy.
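
For reference, the classical EI criterion mentioned above has a well-known closed form under a Gaussian posterior. The sketch below shows only that classical criterion, not the $\rm EI^2$ criterion proposed in the article. It assumes the posterior at a candidate point is $N(m, s^2)$ and that `best` plays the role of $M_n$; the function name and arguments are hypothetical.

```python
import math


def expected_improvement(mean, std, best):
    """E[(f(x) - best)_+] for maximization when f(x) ~ N(mean, std^2)."""
    if std <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


# Pick the candidate with the largest EI (toy posterior values, assumed).
candidates = [(0.8, 0.1), (0.5, 0.6), (1.1, 0.05)]  # (mean, std) pairs
best_so_far = 1.0
scores = [expected_improvement(m, s, best_so_far) for m, s in candidates]
next_index = max(range(len(candidates)), key=lambda i: scores[i])
```

The integral loss $\int_{\mathbb{X}} (f - M_n)_{+}\, d\lambda$ aggregates this pointwise improvement over the whole domain, which is why the location of the maximizer also enters the criterion.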

Polynomial-Chaos-based Kriging Machine Learning

Computer simulation has become the standard tool in many engineering fields for designing and optimizing systems, as well as for assessing their reliability. To cope with demanding analyses such as optimization and reliability, surrogate models (a.k.a. meta-models) have been increasingly investigated in the last decade. Polynomial Chaos Expansions (PCE) and Kriging are two popular non-intrusive meta-modeling techniques. PCE surrogates the computational model with a series of orthonormal polynomials in the input variables, where the polynomials are chosen to be consistent with the probability distributions of those input variables. On the other hand, Kriging assumes that the computer model behaves as a realization of a Gaussian random process whose parameters are estimated from the available computer runs, i.e. input vectors and response values. These two techniques have been developed more or less in parallel so far, with little interaction between the researchers in the two fields. In this paper, PC-Kriging is derived as a new non-intrusive meta-modeling approach combining PCE and Kriging. A sparse set of orthonormal polynomials (PCE) approximates the global behavior of the computational model, whereas Kriging manages the local variability of the model output. An adaptive algorithm similar to the least angle regression algorithm determines the optimal sparse set of polynomials. PC-Kriging is validated on various benchmark analytical functions which are easy to sample for reference results. From the numerical investigations it is concluded that PC-Kriging performs better than, or at least as well as, the two distinct meta-modeling techniques. A larger gain in accuracy is obtained when the experimental design has a limited size, which is an asset when dealing with demanding computational models.
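
The statement that PCE polynomials are matched to the input distribution can be made concrete with a small sketch. It assumes a single input that is uniform on [-1, 1], for which the matching orthonormal family is the Legendre polynomials scaled by sqrt(2n + 1). The function names and the quadrature rule are illustrative choices, not part of the paper, and the Kriging part of PC-Kriging is not shown.

```python
import math


def legendre_orthonormal(degree, x):
    """Legendre polynomial of the given degree at x, scaled so that it has
    unit second moment under the uniform distribution on [-1, 1]."""
    if degree == 0:
        return 1.0
    p_prev, p_curr = 1.0, x
    # Three-term recurrence: (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}.
    for n in range(1, degree):
        p_prev, p_curr = p_curr, ((2 * n + 1) * x * p_curr - n * p_prev) / (n + 1)
    return math.sqrt(2 * degree + 1) * p_curr


def inner_product(i, j, n_points=20000):
    """Midpoint-rule estimate of E[P_i(X) P_j(X)] for X uniform on [-1, 1]."""
    h = 2.0 / n_points
    total = 0.0
    for k in range(n_points):
        x = -1.0 + (k + 0.5) * h
        total += legendre_orthonormal(i, x) * legendre_orthonormal(j, x)
    return total * h / 2.0  # the density of U(-1, 1) is 1/2


norm_2 = inner_product(2, 2)   # close to 1 (unit norm)
cross_13 = inner_product(1, 3)  # close to 0 (orthogonality)
```

For other input distributions a different family would be needed, for example Hermite polynomials for Gaussian inputs.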

Batch simulations and uncertainty quantification in Gaussian process surrogate-based approximate Bayesian computation Machine Learning

Surrogate models such as Gaussian processes (GP) have been proposed to accelerate approximate Bayesian computation (ABC) when the statistical model of interest is expensive to simulate. In one such promising framework, the discrepancy between simulated and observed data is modelled with a GP. So far, principled strategies have been proposed only for sequential selection of the simulation locations. To address this limitation, we develop Bayesian optimal design strategies to parallelise the expensive simulations. Current surrogate-based ABC methods also produce only a point estimate of the ABC posterior, while there can be substantial additional uncertainty due to the limited budget of simulations. We also address the problem of quantifying the uncertainty of the ABC posterior and discuss the connections between our resulting framework, called Bayesian ABC, and Bayesian quadrature (BQ) and Bayesian optimisation (BO). Experiments with several toy and real-world simulation models demonstrate advantages of the proposed techniques.
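
One common way to turn a GP model of the discrepancy into a point estimate of the ABC posterior is sketched below. This is an assumption-laden illustration, not necessarily the estimator used in the paper: it treats the discrepancy at each parameter value as Gaussian with the GP mean and with variance equal to the GP variance plus a simulation noise variance, and weights the prior by the probability that the discrepancy falls below a tolerance `epsilon`. All names and numbers are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def abc_posterior_on_grid(prior, gp_mean, gp_var, noise_var, epsilon):
    """Normalized ABC posterior weights on a finite parameter grid.

    Each weight is prior * P(discrepancy < epsilon), with the discrepancy
    treated as N(gp_mean, gp_var + noise_var) at each grid point.
    """
    unnormalized = [
        p * normal_cdf((epsilon - m) / math.sqrt(v + noise_var))
        for p, m, v in zip(prior, gp_mean, gp_var)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Toy grid of three parameter values (assumed numbers, for illustration).
prior = [1.0 / 3.0] * 3
gp_mean = [3.0, 1.0, 3.0]   # predicted discrepancy at each grid point
gp_var = [0.1, 0.1, 0.1]    # GP posterior variance at each grid point
posterior = abc_posterior_on_grid(prior, gp_mean, gp_var,
                                  noise_var=0.1, epsilon=1.5)
```

A single set of weights like this is exactly the kind of point estimate the abstract refers to; it does not by itself express how uncertain the estimate is given a limited simulation budget.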