Diouane, Youssef
High-Dimensional Bayesian Optimization Using Both Random and Supervised Embeddings
Priem, Rémy, Diouane, Youssef, Bartoli, Nathalie, Dubreuil, Sylvain, Saves, Paul
Bayesian optimization (BO) is one of the most powerful strategies to solve computationally expensive-to-evaluate blackbox optimization problems. However, BO methods are conventionally used for optimization problems of small dimension because of the curse of dimensionality. In this paper, a high-dimensional optimization method incorporating linear embedding subspaces of small dimension is proposed to perform the optimization efficiently. An adaptive learning strategy for these linear embeddings is carried out in conjunction with the optimization. The resulting BO method, named efficient global optimization coupled with random and supervised embedding (EGORSE), combines in an adaptive way both random and supervised linear embeddings. EGORSE has been compared to state-of-the-art algorithms and tested on academic examples with a number of design variables ranging from 10 to 600. The obtained results show the high potential of EGORSE to solve high-dimensional blackbox optimization problems, in terms of both CPU time and the number of calls to the expensive blackbox simulation.
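The embedding mechanism can be illustrated with a short sketch: the search runs in a low-dimensional variable that is lifted back to the original box through a linear map. The objective, the random matrix, and the cheap local solver standing in for the Bayesian optimization loop are illustrative assumptions, not the EGORSE implementation, which additionally learns supervised embeddings adaptively.

```python
# Minimal sketch of optimizing through a random linear embedding: the search
# runs in a d-dimensional space and points are lifted back to the D-dimensional
# box.  The objective, the matrix A, and the local solver standing in for the
# BO loop are illustrative assumptions, not the EGORSE implementation.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
D, d = 200, 5                       # original and embedding dimensions
A = rng.standard_normal((D, d))     # random embedding matrix

def expensive_blackbox(x):
    # Placeholder objective; in practice an expensive simulation.
    return float(np.sum((x - 0.1) ** 2))

def lift(y):
    # Map a low-dimensional point back into the original box [-1, 1]^D.
    return np.clip(A @ y, -1.0, 1.0)

res = minimize(lambda y: expensive_blackbox(lift(y)),
               x0=np.zeros(d), method="Nelder-Mead")
print("best value found:", res.fun)
```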
A Proximal Modified Quasi-Newton Method for Nonsmooth Regularized Optimization
Diouane, Youssef, Habiboullah, Mohamed Laghdaf, Orban, Dominique
We develop R2N, a modified quasi-Newton method for minimizing the sum of a $\mathcal{C}^1$ function $f$ and a lower semi-continuous, prox-bounded function $h$. Both $f$ and $h$ may be nonconvex. At each iteration, our method computes a step by minimizing the sum of a quadratic model of $f$, a model of $h$, and an adaptive quadratic regularization term. A step may be computed by a variant of the proximal-gradient method. An advantage of R2N over trust-region (TR) methods is that the proximal operators do not involve an extra TR indicator. We also develop the variant R2DH, in which the model Hessian is diagonal, which allows us to compute a step without relying on a subproblem solver when $h$ is separable. R2DH can be used as a standalone solver, but also as a subproblem solver inside R2N. We describe non-monotone variants of both R2N and R2DH. Global convergence of a first-order stationarity measure to zero holds without relying on local Lipschitz continuity of $\nabla f$, while allowing model Hessians to grow unbounded, an assumption particularly relevant to quasi-Newton models. Under Lipschitz continuity of $\nabla f$, we establish a tight worst-case complexity bound of $O(1 / \epsilon^{2/(1 - p)})$ to bring said measure below $\epsilon > 0$, where $0 \leq p < 1$ controls the growth of the model Hessians. The latter must not diverge faster than $|\mathcal{S}_k|^p$, where $\mathcal{S}_k$ is the set of successful iterations up to iteration $k$. When $p = 1$, we establish the tight exponential complexity bound $O(\exp(c \epsilon^{-2}))$, where $c > 0$ is a constant. We describe our Julia implementation and report numerical experience on a basis-pursuit problem, image denoising, minimum-rank matrix completion, and a nonlinear support vector machine. In particular, the minimum-rank problem cannot currently be solved directly by a TR approach, as the corresponding proximal operators are not known analytically.
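For a separable nonsmooth term such as $h(x) = \lambda \|x\|_1$, a step with a diagonal model Hessian and quadratic regularization has a closed form via soft-thresholding, which conveys the flavor of R2DH. The sketch below uses assumed notation and assumes positive diagonal weights after regularization; it is an illustration, not the authors' Julia implementation.

```python
# Minimal sketch of one diagonal proximal step in the spirit of R2DH for
# f smooth and h(x) = lam * ||x||_1: with a diagonal model Hessian B and
# regularization parameter sigma, the step has a closed form via
# soft-thresholding.  Notation and positive diagonal weights are assumptions
# for illustration; this is not the authors' Julia implementation.
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def diagonal_prox_step(x, grad_f, diag_B, sigma, lam):
    """Minimize grad_f's + 0.5 s'(B + sigma I)s + lam*||x + s||_1 over s (B diagonal)."""
    w = diag_B + sigma                        # diagonal weights, assumed positive
    z = soft_threshold(x - grad_f / w, lam / w)
    return z - x                              # the step s

# One step on f(x) = 0.5*||x - b||^2, whose exact diagonal Hessian is the identity.
b = np.array([2.0, -0.3, 0.05])
x = np.zeros(3)
s = diagonal_prox_step(x, grad_f=x - b, diag_B=np.ones(3), sigma=0.1, lam=0.5)
print("trial point:", x + s)
```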
A graph-structured distance for heterogeneous datasets with meta variables
Hallé-Hannan, Edward, Audet, Charles, Diouane, Youssef, Digabel, Sébastien Le, Saves, Paul
Heterogeneous datasets emerge in various machine learning or optimization applications that feature different data sources, various data types, and complex relationships between variables. In practice, heterogeneous datasets are often partitioned into smaller well-behaved ones that are easier to process. However, some applications involve expensive-to-generate or limited-size datasets, which motivates methods based on the whole dataset. The first main contribution of this work is a graph-structured modeling framework that generalizes state-of-the-art hierarchical, tree-structured, or variable-size frameworks. This framework models domains that involve heterogeneous datasets in which variables may be continuous, integer, or categorical, with some identified as meta if their values determine the inclusion/exclusion or affect the bounds of other so-called decreed variables. Excluded variables are introduced to manage variables that are either included or excluded depending on the given points. The second main contribution is the graph-structured distance that compares extended points with any combination of included and excluded variables: any pair of points can be compared, which allows working directly with heterogeneous datasets with meta variables. The contributions are illustrated with some regression experiments, in which the performance of a multilayer perceptron with respect to its hyperparameters is modeled with inverse distance weighting and $K$-nearest neighbors models.
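The toy sketch below only illustrates the idea of comparing points in which some variables may be excluded. Representing an excluded variable by None and charging a fixed mismatch penalty when the included/excluded status differs are assumptions made purely for illustration; the paper's graph-structured distance is more refined.

```python
# Toy illustration of comparing two "extended points" in which some variables
# may be excluded depending on meta variables.  Representing an excluded
# variable by None and charging a fixed mismatch penalty when the
# included/excluded status differs are assumptions for illustration only; the
# paper's graph-structured distance is more refined.
def mixed_distance(p, q, penalty=1.0):
    total = 0.0
    for key in set(p) | set(q):
        a, b = p.get(key), q.get(key)
        if a is None and b is None:      # excluded in both points: no contribution
            continue
        if a is None or b is None:       # included in one point, excluded in the other
            total += penalty
        elif isinstance(a, str):         # categorical variable: 0/1 mismatch
            total += 0.0 if a == b else 1.0
        else:                            # continuous or integer variable
            total += abs(a - b)
    return total

p1 = {"optimizer": "adam", "learning_rate": 1e-3, "momentum": None}
p2 = {"optimizer": "sgd",  "learning_rate": 1e-2, "momentum": 0.9}
print(mixed_distance(p1, p2))            # 1.0 + 0.009 + 1.0 = 2.009
```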
A general error analysis for randomized low-rank approximation with application to data assimilation
Di Perrotolo, Alexandre Scotto, Diouane, Youssef, Gürol, Selime, Vasseur, Xavier
Randomized algorithms have proven to perform well on a large class of numerical linear algebra problems. Their theoretical analysis is critical to provide guarantees on their behaviour, and in this sense, the stochastic analysis of the randomized low-rank approximation error plays a central role. Indeed, several randomized methods for the approximation of dominant eigen- or singular modes can be rewritten as low-rank approximation methods. However, despite the large variety of algorithms, the existing theoretical frameworks for their analysis rely on a specific structure of the covariance matrix that is not adapted to all the algorithms. We propose a general framework for the stochastic analysis of the low-rank approximation error in Frobenius norm for centered and non-standard Gaussian matrices. Under minimal assumptions on the covariance matrix, we derive accurate bounds both in expectation and in probability. Our bounds have clear interpretations that enable us to derive properties and motivate practical choices for the covariance matrix, resulting in efficient low-rank approximation algorithms. The bounds most commonly used in the literature are shown to be a specific instance of the bounds proposed here, which are moreover tighter. Numerical experiments related to data assimilation further illustrate that exploiting the problem structure to select the covariance matrix improves the performance, as suggested by our bounds.
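A minimal sketch of the setting considered: a randomized low-rank approximation built from a test matrix whose columns follow a non-standard (correlated) Gaussian distribution $N(0, LL^T)$. The covariance factor chosen below is purely illustrative and unrelated to the data-assimilation structure exploited in the paper.

```python
# Minimal sketch of a randomized low-rank approximation built from a
# non-standard (correlated) Gaussian test matrix: the columns of Omega follow
# N(0, L L^T).  The covariance factor L chosen here is purely illustrative and
# unrelated to the data-assimilation structure exploited in the paper.
import numpy as np

rng = np.random.default_rng(0)
n, rank = 300, 10

G = rng.standard_normal((n, n))
A = G @ G.T                                    # symmetric matrix to approximate

L = np.tril(np.ones((n, n))) / n               # illustrative covariance factor
Omega = L @ rng.standard_normal((n, rank))     # columns ~ N(0, L L^T)

Q, _ = np.linalg.qr(A @ Omega)                 # orthonormal basis of the sketch
A_lowrank = Q @ (Q.T @ A)                      # rank-`rank` approximation of A

err = np.linalg.norm(A - A_lowrank, "fro") / np.linalg.norm(A, "fro")
print(f"relative Frobenius error: {err:.3e}")
```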
SMT 2.0: A Surrogate Modeling Toolbox with a focus on Hierarchical and Mixed Variables Gaussian Processes
Saves, Paul, Lafage, Remi, Bartoli, Nathalie, Diouane, Youssef, Bussemaker, Jasper, Lefebvre, Thierry, Hwang, John T., Morlier, Joseph, Martins, Joaquim R. R. A.
The Surrogate Modeling Toolbox (SMT) is an open-source Python package that offers a collection of surrogate modeling methods, sampling techniques, and a set of sample problems. This paper presents SMT 2.0, a major new release of SMT that introduces significant upgrades and new features to the toolbox. This release adds the capability to handle mixed-variable surrogate models and hierarchical variables. These types of variables are becoming increasingly important in several surrogate modeling applications. SMT 2.0 also improves SMT by extending sampling methods, adding new surrogate models, and computing variance and kernel derivatives for Kriging. This release also includes new functions to handle noisy data and multifidelity data. To the best of our knowledge, SMT 2.0 is the first open-source surrogate library to propose surrogate models for hierarchical and mixed inputs. This open-source software is distributed under the New BSD license.
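A minimal usage sketch following the interface documented by SMT (LHS sampling, Kriging training, mean and variance prediction); exact option names may differ between releases.

```python
# Minimal usage sketch following the SMT documentation (LHS sampling, Kriging
# training, mean and variance prediction); exact option names may vary between
# releases.
import numpy as np
from smt.sampling_methods import LHS
from smt.surrogate_models import KRG

xlimits = np.array([[0.0, 4.0]])
xt = LHS(xlimits=xlimits)(10)            # 10 training points in [0, 4]
yt = np.sin(xt)

sm = KRG(theta0=[1e-2])                  # Kriging surrogate
sm.set_training_values(xt, yt)
sm.train()

xe = np.linspace(0.0, 4.0, 5).reshape(-1, 1)
print(sm.predict_values(xe).ravel())     # predicted mean
print(sm.predict_variances(xe).ravel())  # predicted variance
```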
High-dimensional mixed-categorical Gaussian processes with application to multidisciplinary design optimization for a green aircraft
Saves, Paul, Diouane, Youssef, Bartoli, Nathalie, Lefebvre, Thierry, Morlier, Joseph
Multidisciplinary design optimization (MDO) methods aim at adapting numerical optimization techniques to the design of engineering systems involving multiple disciplines. In this context, a large number of mixed continuous, integer, and categorical variables might arise during the optimization process, and practical applications involve a significant number of design variables. Recently, there has been a growing interest in mixed-categorical metamodels based on Gaussian processes (GP) for Bayesian optimization. In particular, to handle mixed-categorical variables, several existing approaches employ different strategies to build the GP. These strategies either use continuous kernels, such as the continuous relaxation or the Gower distance-based kernels, or a direct estimation of the correlation matrix, such as the exponential homoscedastic hypersphere (EHH) or the homoscedastic hypersphere (HH) kernel. Although the EHH and HH kernels are shown to be very efficient and lead to accurate GPs, they rely on a large number of hyperparameters. In this paper, we address this issue by constructing mixed-categorical GPs with fewer hyperparameters using partial least squares (PLS) regression. Our goal is to generalize Kriging with PLS, commonly used for continuous inputs, to handle mixed-categorical inputs. The proposed method is implemented in the open-source software SMT and has been efficiently applied to structural and multidisciplinary applications. Our method is applied to the structural analysis of a cantilever beam and facilitates the MDO of a green aircraft, resulting in a 439-kilogram reduction in the amount of fuel consumed during a single aircraft mission.
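The sketch below only illustrates the underlying ingredients: a categorical input is relaxed through one-hot encoding and a PLS projection extracts a few latent directions that could drive a reduced set of kernel hyperparameters. scikit-learn is used for illustration; this is not the mixed-categorical Kriging-with-PLS kernel implemented in SMT.

```python
# Sketch of the underlying ingredients only: a categorical input is relaxed via
# one-hot encoding and partial least squares extracts a few latent directions
# that could drive a reduced set of kernel hyperparameters.  scikit-learn is
# used for illustration; this is not the mixed-categorical Kriging-with-PLS
# kernel implemented in SMT.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n = 60
x_cont = rng.uniform(size=(n, 3))                               # continuous inputs
x_cat = rng.choice(["steel", "alu", "composite"], size=(n, 1))  # categorical input

levels = np.array(["steel", "alu", "composite"])
onehot = (x_cat == levels).astype(float)        # continuous relaxation (one-hot)
X = np.hstack([x_cont, onehot])
y = np.sin(3.0 * x_cont[:, :1]) + (x_cat == "alu").astype(float)

pls = PLSRegression(n_components=2)             # keep only 2 latent directions
pls.fit(X, y)
print("PLS weight matrix shape:", pls.x_weights_.shape)   # (6, 2)
```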
TREGO: a Trust-Region Framework for Efficient Global Optimization
Diouane, Youssef, Picheny, Victor, Riche, Rodolphe Le, Di Perrotolo, Alexandre Scotto
Efficient Global Optimization (EGO) is the canonical form of Bayesian optimization and has been successfully applied to the global optimization of expensive-to-evaluate black-box problems. However, EGO struggles to scale with dimension and offers limited theoretical guarantees. In this work, we propose and analyze a trust-region-like EGO method (TREGO). TREGO alternates between regular EGO steps and local steps within a trust region. By following a classical scheme for the trust region (based on a sufficient decrease condition), we demonstrate that our algorithm enjoys strong global convergence properties, while departing from EGO only for a subset of optimization steps. Using extensive numerical experiments based on the well-known COCO benchmark, we first analyze the sensitivity of TREGO to its own parameters, then show that the resulting algorithm consistently outperforms EGO and is competitive with other state-of-the-art global optimization methods.
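A skeleton of alternating global and local steps under a classical sufficient-decrease trust-region update is sketched below. The proposal functions are random placeholders for the EGO acquisition steps and the constants are arbitrary, so this illustrates the control scheme only, not the TREGO implementation.

```python
# Skeleton of alternating global and local steps under a classical
# sufficient-decrease trust-region update.  The proposal functions are random
# placeholders for the EGO acquisition steps and the constants are arbitrary:
# this illustrates the control scheme only, not the TREGO implementation.
import numpy as np

rng = np.random.default_rng(0)

def f(x):                                    # placeholder expensive objective
    return float(np.sum(x ** 2))

def propose_global(lb, ub):                  # stands in for a regular EGO step
    return rng.uniform(lb, ub)

def propose_local(center, radius, lb, ub):   # stands in for a local EGO step
    return np.clip(center + rng.uniform(-radius, radius, size=center.size), lb, ub)

lb, ub = -5.0 * np.ones(2), 5.0 * np.ones(2)
x_best = rng.uniform(lb, ub)
f_best = f(x_best)
radius, kappa = 1.0, 1e-4                    # trust-region radius, decrease constant

for k in range(50):
    global_phase = (k % 2 == 0)              # alternate global and local steps
    x_new = propose_global(lb, ub) if global_phase else propose_local(x_best, radius, lb, ub)
    f_new = f(x_new)
    if f_new <= f_best - kappa * radius ** 2:    # sufficient decrease: success
        x_best, f_best = x_new, f_new
        if not global_phase:
            radius *= 2.0                        # expand after a successful local step
    elif not global_phase:
        radius *= 0.5                            # shrink after a failed local step

print("best value:", f_best)
```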
An efficient application of Bayesian optimization to an industrial MDO framework for aircraft design
Priem, Remy, Gagnon, Hugo, Chittick, Ian, Dufresne, Stephane, Diouane, Youssef, Bartoli, Nathalie
The multi-level, multi-disciplinary, and multi-fidelity optimization framework developed at Bombardier Aviation has shown great results in exploring efficient and competitive aircraft configurations. This optimization framework has been developed within the Isight software, which offers a set of ready-to-use optimizers. Unfortunately, the computational effort required by the Isight optimizers can be prohibitive with respect to the requirements of an industrial context. In this paper, a constrained Bayesian optimization solver, namely the super efficient global optimization with mixture of experts, is used to reduce the optimization computational effort. The obtained results show significant improvements compared to two popular Isight optimizers. The capabilities of the tested constrained Bayesian optimization solver are demonstrated on Bombardier research aircraft configuration study cases.
Upper Trust Bound Feasibility Criterion for Mixed Constrained Bayesian Optimization with Application to Aircraft Design
Priem, Rémy, Bartoli, Nathalie, Diouane, Youssef, Sgueglia, Alessandro
Bayesian optimization methods have been successfully applied to black-box optimization problems that are expensive to evaluate. In this paper, we adapt the so-called super efficient global optimization algorithm to solve mixed constrained problems more accurately. The proposed approach handles constraints by means of an upper trust bound, which encourages exploration of the feasible domain by combining the mean prediction and the associated uncertainty given by the Gaussian processes. On top of that, a refinement procedure, based on a learning rate criterion, is introduced to enhance the exploitation and exploration trade-off. We show the good potential of the approach on a set of numerical experiments. Finally, we present an application to a conceptual aircraft configuration, on which we show the superiority of the proposed approach compared to a set of state-of-the-art black-box optimization solvers. Keywords: global optimization, mixed constrained optimization, black-box optimization, Bayesian optimization, Gaussian process.
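The sketch below shows one way to combine a constraint GP's mean and uncertainty into an exploratory feasibility test for g(x) <= 0: a point is kept as possibly feasible when the optimistic bound mean - k*std is non-positive, which favors exploring uncertain regions. The GP (scikit-learn), the constraint, the coefficient k, and the sign convention are illustrative assumptions, not the paper's exact upper-trust-bound criterion.

```python
# Sketch of a feasibility test that combines the GP mean and uncertainty for a
# constraint g(x) <= 0: a point is kept as "possibly feasible" when the
# optimistic bound mean - k*std is non-positive, which favors exploring
# uncertain regions.  The GP (scikit-learn), the constraint, the coefficient k,
# and the sign convention are illustrative assumptions, not the paper's exact
# upper-trust-bound criterion.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
xt = rng.uniform(-2.0, 2.0, size=(8, 1))
gt = (xt ** 2 - 1.0).ravel()                    # true constraint g(x) = x^2 - 1

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-6)
gp.fit(xt, gt)

def possibly_feasible(x, k=2.0):
    mean, std = gp.predict(np.atleast_2d(x), return_std=True)
    return mean - k * std <= 0.0                # relaxed (exploratory) feasibility test

print(possibly_feasible([[0.0]]), possibly_feasible([[1.8]]))
```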