Hierarchies of Reward Machines

Daniel Furelos-Blanco, Mark Law, Anders Jonsson, Krysia Broda, Alessandra Russo

Jun-4-2023–arXiv.org Artificial Intelligence

Reward machines (RMs) are a recent formalism for representing the reward function of a reinforcement learning task through a finite-state machine whose edges encode subgoals of the task using high-level events. The structure of RMs enables the decomposition of a task into simpler and independently solvable subtasks that help tackle long-horizon and/or sparse reward tasks. We propose a formalism for further abstracting the subtask structure by endowing an RM with the ability to call other RMs, thus composing a hierarchy of ...

Hierarchical reinforcement learning (HRL; Barto & Mahadevan, 2003) frameworks, such as options (Sutton et al., 1999), have been used to exploit RMs by learning policies at two levels of abstraction: (i) select a formula (i.e., subgoal) from a given RM state, and (ii) select an action to (eventually) satisfy the chosen formula (Toro Icarte et al., 2018; Furelos-Blanco et al., 2021). The subtask decomposition powered by HRL enables learning at multiple scales simultaneously, and eases the handling of long-horizon and sparse reward tasks. In addition, several works have considered the problem of learning the RMs themselves from interaction (e.g., Toro Icarte et al., 2019; Xu et al., 2020; Furelos-Blanco et al., 2021; Hasanbeig et al., 2021).
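
To make the finite-state view of a reward machine concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a formula is simply a set of high-level events that must all be observed in the same step, and that the machine stays in its current state when no edge fires. The class name `RewardMachine`, the helper `run`, the state names `u0`, `u1`, `uA`, and the events `key` and `door` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Illustrative flat reward machine (hypothetical structure)."""
    initial: str
    accepting: str
    # transitions[state] is a list of (formula, next_state, reward) edges.
    # Assumption: a formula is a frozenset of events that must all be observed.
    transitions: dict = field(default_factory=dict)

    def subgoals(self, state):
        """Formulas available from `state`: the high-level choice in (i)."""
        return [formula for formula, _, _ in self.transitions.get(state, [])]

    def step(self, state, events):
        """Follow the first edge whose formula is satisfied by `events`."""
        for formula, next_state, reward in self.transitions.get(state, []):
            if formula <= events:  # every required event was observed
                return next_state, reward
        return state, 0.0  # no edge fired: remain in place, no reward


def run(rm, trace):
    """Feed a sequence of event sets to the machine and sum the rewards."""
    state, total = rm.initial, 0.0
    for events in trace:
        state, reward = rm.step(state, frozenset(events))
        total += reward
    return state, total


# Hypothetical task: observe "key", then observe "door".
rm = RewardMachine(
    initial="u0",
    accepting="uA",
    transitions={
        "u0": [(frozenset({"key"}), "u1", 0.0)],
        "u1": [(frozenset({"door"}), "uA", 1.0)],
    },
)

final_state, total_reward = run(rm, [set(), {"key"}, set(), {"door"}])
print(final_state == rm.accepting, total_reward)
```

In this sketch, `subgoals` exposes the formulas a high-level policy could choose among from a given RM state, while a low-level policy would be responsible for producing environment actions that eventually make the chosen formula true. The sparse reward is visible in the example: only the final edge carries a nonzero reward.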
