Källström, Johan
Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning
Vamplew, Peter, Foale, Cameron, Hayes, Conor F., Mannion, Patrick, Howley, Enda, Dazeley, Richard, Johnson, Scott, Källström, Johan, Ramos, Gabriel, Rădulescu, Roxana, Röpke, Willem, Roijers, Diederik M.
Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from those rewards. In this paper we extend this paradigm to the context of single-objective reinforcement learning (RL), and outline multiple potential benefits including the ability to perform multi-policy learning across tasks relating to uncertain objectives, risk-aware RL, discounting, and safe RL. We also examine the algorithmic implications of adopting a utility-based approach.
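A minimal sketch of the utility-based idea from this abstract, under assumptions of my own (the toy safe/risky return distributions and all function names are illustrative, not taken from the paper): a policy is scored by the expected utility of its return, so conventional single-objective RL is recovered as the identity utility, while a concave utility makes the agent risk-aware.

```python
# Hypothetical sketch of the utility-based view of RL; everything here is
# illustrative, not code from the paper.
import numpy as np

rng = np.random.default_rng(0)

def u_identity(ret):
    """Standard single-objective RL: utility is the raw scalar return."""
    return float(ret)

def u_risk_averse(ret):
    """A concave utility over the return, encoding risk aversion."""
    return float(np.log1p(max(ret, 0.0)))

# Two toy policies with equal expected return (10): one deterministic,
# one a 50/50 gamble between 0 and 20.
safe_returns = np.full(1000, 10.0)
risky_returns = 20.0 * (rng.random(1000) < 0.5)

for name, u in [("identity", u_identity), ("risk-averse", u_risk_averse)]:
    print(f"{name:12s}"
          f"  E[u(safe)]={np.mean([u(r) for r in safe_returns]):.2f}"
          f"  E[u(risky)]={np.mean([u(r) for r in risky_returns]):.2f}")
```

Under the identity utility both policies look equally good (expected return 10), but the concave utility prefers the deterministic policy, illustrating the risk-aware RL use case the abstract mentions.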
A Practical Guide to Multi-Objective Reinforcement Learning and Planning
Hayes, Conor F., Rădulescu, Roxana, Bargiacchi, Eugenio, Källström, Johan, Macfarlane, Matthew, Reymond, Mathieu, Verstraeten, Timothy, Zintgraf, Luisa M., Dazeley, Richard, Heintz, Fredrik, Howley, Enda, Irissappane, Athirai A., Mannion, Patrick, Nowé, Ann, Ramos, Gabriel, Restelli, Marcello, Vamplew, Peter, Roijers, Diederik M.
Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or assumes that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods and who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems.
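To make the abstract's caution about simple linear combinations concrete, here is a hedged toy example (the three candidate policies and their vector returns are invented for illustration, not taken from the paper): no fixed weight vector can select a balanced, Pareto-optimal policy whose vector return lies in a concave region of the Pareto front, whereas a nonlinear utility such as the minimum over objectives can.

```python
# Toy illustration (policies and returns invented, not from the paper) of
# a known limitation of linear scalarization in multi-objective settings.
import numpy as np

# Vector returns (objective 1, objective 2) of three candidate policies.
returns = {
    "extreme A": np.array([10.0, 0.0]),
    "extreme B": np.array([0.0, 10.0]),
    "balanced":  np.array([4.0, 4.0]),  # Pareto-optimal, concave region
}

def best(utility):
    """Name of the policy maximising the given utility of the return."""
    return max(returns, key=lambda name: utility(returns[name]))

# Linear scalarization: for every weight vector an extreme policy wins,
# because max(10*w1, 10*(1 - w1)) >= 5 > 4 for all w1 in [0, 1].
for w1 in np.linspace(0.0, 1.0, 11):
    w = np.array([w1, 1.0 - w1])
    assert best(lambda r: w @ r) in ("extreme A", "extreme B")

# A nonlinear utility expressing "both objectives matter" can select it.
print(best(lambda r: float(np.min(r))))  # -> balanced
```

This reflects the standard observation that linear scalarization can only recover solutions on the convex hull of the Pareto front; policies in concave regions require a nonlinear utility or a multi-policy method.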