Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning

Vamplew, Peter, Foale, Cameron, Hayes, Conor F., Mannion, Patrick, Howley, Enda, Dazeley, Richard, Johnson, Scott, Källström, Johan, Ramos, Gabriel, Rădulescu, Roxana, Röpke, Willem, Roijers, Diederik M.

Feb-4-2024–arXiv.org Artificial Intelligence

Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from those rewards. In this paper we extend this paradigm to the context of single-objective reinforcement learning (RL), and outline multiple potential benefits including the ability to perform multi-policy learning across tasks relating to uncertain objectives, risk-aware RL, discounting, and safe RL. We also examine the algorithmic implications of adopting a utility-based approach.

So far the flow of knowledge has primarily been from conventional single-objective RL (SORL) into MORL, with algorithmic innovations from SORL being adapted to the context of multiple objectives [2, 6, 22, 34]. This paper runs counter to that trend, as we will argue that the utility-based paradigm, which has been widely adopted in MORL [5, 13, 21], has both relevance and benefits to SORL. We present a general framework for utility-based RL (UBRL), which unifies the SORL and MORL frameworks, and discuss benefits and potential applications of this for single-objective problems - in particular focusing on the novel potential UBRL offers ...
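As a rough illustration of the idea described above (environmental rewards plus a separate function giving the utility the user derives from them), the following minimal Python sketch compares two sets of episode returns under two utility functions. It is not taken from the paper: the return values, the function names `expected_utility` and `risk_averse_utility`, and the exponential form of the utility are all hypothetical choices made only for this example.

```python
import math
from statistics import mean


def expected_utility(returns, utility):
    """Apply the utility function to each episode return, then average."""
    return mean(utility(g) for g in returns)


def identity(g):
    # Conventional single-objective view: utility equals the return itself.
    return g


def risk_averse_utility(g, scale=10.0):
    # Hypothetical concave utility: extra reward is worth less and less,
    # so high-variance outcomes are penalised relative to steady ones.
    return 1.0 - math.exp(-g / scale)


# Made-up episode returns for two policies in a single-objective task.
safe_returns = [10.0, 10.0, 10.0, 10.0]   # steady, mean 10
risky_returns = [0.0, 0.0, 0.0, 44.0]     # volatile, mean 11

for name, u in [("identity", identity), ("risk-averse", risk_averse_utility)]:
    safe = expected_utility(safe_returns, u)
    risky = expected_utility(risky_returns, u)
    preferred = "safe" if safe > risky else "risky"
    print(f"{name}: safe={safe:.3f} risky={risky:.3f} preferred={preferred}")
```

With the identity utility the volatile policy is preferred because its mean return is higher, while the concave utility prefers the steady policy. The environment's reward signal is the same in both cases; only the user's utility function changes, which is the kind of separation the utility-based view relies on.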

artificial intelligence, machine learning, reinforcement learning

