Revisiting Multi-Objective MDPs with Relaxed Lexicographic Preferences
Pineda, Luis Enrique (University of Massachusetts Amherst) | Wray, Kyle Hollins (University of Massachusetts Amherst) | Zilberstein, Shlomo (University of Massachusetts Amherst)
We consider stochastic planning problems that involve multiple objectives such as minimizing task completion time and energy consumption. These problems can be modeled as multi-objective Markov decision processes (MOMDPs), an extension of the widely used MDP model to handle problems involving multiple value functions. We focus on a subclass of MOMDPs in which the objectives have a {\em relaxed lexicographic structure}, allowing an agent to seek improvement in a lower-priority objective when the impact on a higher-priority objective is within some small given tolerance. We examine the relationship between this class of problems and {\em constrained MDPs}, showing that the latter offer an alternative solution method with strong guarantees. We show empirically that a recently introduced algorithm for MOMDPs may not offer the same strong guarantees, but it does perform well in practice.
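To make the relaxed lexicographic idea concrete, the following is a minimal sketch (not the paper's algorithm) of slack-based lexicographic value iteration on a hypothetical two-objective MDP: a chain of states where a "fast" action saves time but costs energy. The first pass optimizes the higher-priority objective (time); actions whose time value is within a tolerance `eta` of optimal are retained; a second pass optimizes the lower-priority objective (energy) among the retained actions. All names, costs, and dynamics here are invented for illustration.

```python
# Toy two-objective MDP (illustrative only): states 0..3, state 3 is an
# absorbing goal. Objective 0 = time cost (higher priority),
# objective 1 = energy cost (lower priority). Rewards are negated costs.
STATES = [0, 1, 2, 3]
ACTIONS = ["fast", "slow"]
GAMMA = 1.0
GOAL = 3

def transitions(s, a):
    """Return a list of (next_state, probability); 'fast' sometimes slips."""
    if s == GOAL:
        return [(s, 1.0)]
    if a == "fast":
        return [(s + 1, 0.9), (s, 0.1)]
    return [(s + 1, 1.0)]

def reward(s, a, obj):
    """Negated cost for the given objective: fast is quick but energy-hungry."""
    if s == GOAL:
        return 0.0
    costs = {"fast": (1.0, 3.0), "slow": (2.0, 1.0)}  # (time, energy)
    return -costs[a][obj]

def value_iteration(allowed, obj, iters=200):
    """Value iteration optimizing objective `obj`, restricted to allowed[s]."""
    V = {s: 0.0 for s in STATES}
    Q = {}
    for _ in range(iters):
        Q = {(s, a): reward(s, a, obj)
             + GAMMA * sum(p * V[s2] for s2, p in transitions(s, a))
             for s in STATES for a in allowed[s]}
        V = {s: max(Q[(s, a)] for a in allowed[s]) for s in STATES}
    return V, Q

# Pass 1: optimize time over all actions.
all_actions = {s: list(ACTIONS) for s in STATES}
V1, Q1 = value_iteration(all_actions, obj=0)

# Relaxation: keep actions whose time value is within eta of optimal.
eta = 1.5
relaxed = {s: [a for a in ACTIONS if Q1[(s, a)] >= V1[s] - eta] for s in STATES}

# Pass 2: optimize energy among the retained actions.
V2, Q2 = value_iteration(relaxed, obj=1)
policy = {s: max(relaxed[s], key=lambda a: Q2[(s, a)]) for s in STATES if s != GOAL}
print(policy)  # with eta = 1.5, "slow" is tolerated and wins on energy
```

With `eta = 0` the relaxation disappears and the policy collapses to the strictly time-optimal "fast" everywhere; the tolerance is what lets the secondary objective influence action choice.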
Nov-1-2015