A Survey of Multi-Objective Sequential Decision-Making

Journal of Artificial Intelligence Research

Sequential decision-making problems with multiple objectives arise naturally in practice and pose unique challenges for research in decision-theoretic planning and learning, which has largely focused on single-objective settings. This article surveys algorithms designed for sequential decision-making problems with multiple objectives. Though there is a growing body of literature on this subject, little of it makes explicit under what circumstances special methods are needed to solve multi-objective problems. Therefore, we identify three distinct scenarios in which converting such a problem to a single-objective one is impossible, infeasible, or undesirable. Furthermore, we propose a taxonomy that classifies multi-objective methods according to the applicable scenario, the nature of the scalarization function (which projects multi-objective values to scalar ones), and the type of policies considered. We show how these factors determine the nature of an optimal solution, which can be a single policy, a convex hull, or a Pareto front. Using this taxonomy, we survey the literature on multi-objective methods for planning and learning. Finally, we discuss key applications of such methods and outline opportunities for future work.

Multi-node environment strategy for Parallel Deterministic Multi-Objective Fractal Decomposition

arXiv.org Artificial Intelligence

This paper deals with these problems by using a new decomposition-based algorithm called: "Fractal geometric decomposition base algorithm" (FDA). It is a deterministic metaheuristic developed to solve large-scale continuous optimization problems [5]. It can be noticed, that we call large scale problems those having the dimension greater than 1000. In this research, we are interested in using FDA to deal with MOPs because in the literature decomposition based algorithms have been with more less success applied to solve these problems, their main problem is related to their complexity. In this work, the goal is to deal with this complexity problem by keeping the same level of efficiency. FDA is based on "divide-and-conquer" paradigm where the sub-regions are hyperspheres rather than hypercubes on classical approaches. In order to identify the Pareto optimal solutions, we propose to extend FDA using the scalarization approach. We called the proposed algorithm Mo-FDA.


AAAI Conferences

In this paper an extension of the Network Security Games (NSG) is presented, that aims to incorporate the advantages of "standard" expert-based security risk assessment procedures and provide proper formalisation for general large-scale infrastructure protection problems. An instantiation procedure of the model is proposed, which is grounded on the classical security risk assessment methodologies, building a bridge between general standards and Game Theory Security models. The security control selection problem is modelled as a multi-objective optimisation problem. Two interwoven models are developed for addressing the security risk assessment problem. The asset model describes the system and its parameters, while the attack model is used to formalise possible threat scenarios. A specific solver for the stated multi-objective optimisation problem is described in details with theoretically grounded justification of its' correctness. Proposed model is instantiated for an airport case study, and the essential building blocks of the methodology are discussed. The work reported in this paper shows the feasibility of a generalised mathematically founded approach to security risk assessment in large-scale system engineering.

Targeting Solutions in Bayesian Multi-Objective Optimization: Sequential and Parallel Versions

arXiv.org Machine Learning

Multi-objective optimization aims at finding trade-off solutions to conflicting objectives. These constitute the Pareto optimal set. In the context of expensive-to-evaluate functions, it is impossible and often non-informative to look for the entire set. As an end-user would typically prefer a certain part of the objective space, we modify the Bayesian multi-objective optimization algorithm which uses Gaussian Processes to maximize the Expected Hypervolume Improvement, to focus the search in the preferred region. The cumulated effects of the Gaussian Processes and the targeting strategy lead to a particularly efficient convergence to the desired part of the Pareto set. To take advantage of parallel computing, a multi-point extension of the targeting criterion is proposed and analyzed.

Online Article Ranking as a Constrained, Dynamic, Multi-Objective Optimization Problem

AAAI Conferences

The content ranking problem in a social news website is typically a function that maximizes a scalar metric like dwell-time. However, in most real-world applications we are interested in more than one metric — for instance, simultaneously maximizing click-through rate, monetization metrics, and dwell-time — while also satisfying the constraints from traffic requirements promised to different publishers. The solution needs to be an online algorithm since the data arrives serially. Additionally, the objective function and the constraints can dynamically change. In this paper, we formulate this problem as a constrained, dynamic, multi-objective optimization problem. We propose a novel framework that extends a successful genetic optimization algorithm, NSGA-II, to solve our ranking problem. We evaluate optimization performance using the Hypervolume metric. We demonstrate the application of our framework on a real-world article ranking problem from the Yahoo! News page. We observe that our proposed solution makes considerable improvements in both time and performance over a brute-force baseline technique that is currently in production.