Student-$t$ processes have recently been proposed as an appealing alternative non-parameteric function prior. They feature enhanced flexibility and predictive variance. In this work the use of Student-$t$ processes are explored for multi-objective Bayesian optimization. In particular, an analytical expression for the hypervolume-based probability of improvement is developed for independent Student-$t$ process priors of the objectives. Its effectiveness is shown on a multi-objective optimization problem which is known to be difficult with traditional Gaussian processes.
The content ranking problem in a social news website is typically a function that maximizes a scalar metric like dwell-time. However, in most real-world applications we are interested in more than one metric — for instance, simultaneously maximizing click-through rate, monetization metrics, and dwell-time — while also satisfying the constraints from traffic requirements promised to different publishers. The solution needs to be an online algorithm since the data arrives serially. Additionally, the objective function and the constraints can dynamically change. In this paper, we formulate this problem as a constrained, dynamic, multi-objective optimization problem. We propose a novel framework that extends a successful genetic optimization algorithm, NSGA-II, to solve our ranking problem. We evaluate optimization performance using the Hypervolume metric. We demonstrate the application of our framework on a real-world article ranking problem from the Yahoo! News page. We observe that our proposed solution makes considerable improvements in both time and performance over a brute-force baseline technique that is currently in production.
Much of the focus in machine learning research is placed in creating new architectures and optimization methods, but the overall loss function is seldom questioned. This paper interprets machine learning from a multi-objective optimization perspective, showing the limitations of the default linear combination of loss functions over a data set and introducing the hypervolume indicator as an alternative. It is shown that the gradient of the hypervolume is defined by a self-adjusting weighted mean of the individual loss gradients, making it similar to the gradient of a weighted mean loss but without requiring the weights to be defined a priori. This enables an inner boosting-like behavior, where the current model is used to automatically place higher weights on samples with higher losses but without requiring the use of multiple models. Results on a denoising autoencoder show that the new formulation is able to achieve better mean loss than the direct optimization of the mean loss, providing evidence to the conjecture that self-adjusting the weights creates a smoother loss surface.
Bringmann, Karl (Max Planck Institute for Informatic) | Friedrich, Tobias (Max Planck Institute for Informatic) | Neumann, Frank (The University of Adelaide) | Wagner, Markus (The University of Adelaide)
Multi-objective optimization problems arise frequently in applications but can often only be solved approximately by heuristic approaches. Evolutionary algorithms have been widely used to tackle multi-objective problems. These algorithms use different measures to ensure diversity in the objective space but are not guided by a formal notion of approximation. We present a new framework of an evolutionary algorithm for multi-objective optimization that allows to work with a formal notion of approximation. Our experimental results show that our approach outperforms state-of-the-art evolutionary algorithms in terms of the quality of the approximation that is obtained in particular for problems with many objectives.
Optimizing nonlinear systems involving expensive (computer) experiments with regard to conflicting objectives is a common challenge. When the number of experiments is severely restricted and/or when the number of objectives increases, uncovering the whole set of optimal solutions (the Pareto front) is out of reach, even for surrogate-based approaches. As non-compromising Pareto optimal solutions have usually little point in applications, this work restricts the search to relevant solutions that are close to the Pareto front center. The article starts by characterizing this center. Next, a Bayesian multi-objective optimization method for directing the search towards it is proposed. A criterion for detecting convergence to the center is described. If the criterion is triggered, a widened central part of the Pareto front is targeted such that sufficiently accurate convergence to it is forecasted within the remaining budget. Numerical experiments show how the resulting algorithm, C-EHI, better locates the central part of the Pareto front when compared to state-of-the-art Bayesian algorithms.