Synthesis of Hierarchical Controllers Based on Deep Reinforcement Learning Policies
Delgrange, Florent, Avni, Guy, Lukina, Anna, Schilling, Christian, Nowé, Ann, Pérez, Guillermo A.
arXiv.org Artificial Intelligence
We propose a novel approach to the problem of controller design for environments modeled as Markov decision processes (MDPs). Specifically, we consider a hierarchical MDP: a graph with each vertex populated by an MDP called a "room." We first apply deep reinforcement learning (DRL) to obtain low-level policies for each room, scaling to large rooms of unknown structure. We then apply reactive synthesis to obtain a high-level planner that chooses which low-level policy to execute in each room. The central challenge in synthesizing the planner is the need to model the rooms. We address this challenge by developing a DRL procedure to train concise "latent" policies together with PAC guarantees on their performance. Unlike previous approaches, ours circumvents a model distillation step. Our approach combats sparse rewards in DRL and enables reusability of low-level policies. We demonstrate feasibility in a case study involving agent navigation amid moving obstacles.
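To make the hierarchical structure concrete, the following is a minimal, hypothetical sketch (not the authors' code): rooms form a graph, each edge between adjacent rooms carries a pre-trained low-level policy driving the agent to the connecting exit, and a high-level planner chooses which low-level policy to run. BFS over the room graph stands in for the reactive-synthesis planner; the stub `make_room_policy` stands in for a DRL-trained latent policy. All names here are illustrative assumptions.

```python
# Hypothetical sketch of a hierarchical controller: a high-level planner
# selects which low-level "room" policy to execute, as in the abstract.

def make_room_policy(exit_cell):
    """Stand-in for a DRL-trained low-level policy: greedily steps
    the agent's (x, y) position toward the room's target exit cell."""
    def policy(state):
        x, y = state
        ex, ey = exit_cell
        if x < ex:
            return "right"
        if x > ex:
            return "left"
        if y < ey:
            return "up"
        return "down"
    return policy

class HierarchicalController:
    """Rooms form a graph; the planner picks the next room, and hence
    the low-level policy that drives the agent toward that room's exit."""

    def __init__(self, room_graph, exits, goal_room):
        self.room_graph = room_graph  # room -> list of adjacent rooms
        self.exits = exits            # (room, next_room) -> exit cell
        self.goal_room = goal_room
        self.policies = {edge: make_room_policy(cell)
                         for edge, cell in exits.items()}

    def plan(self, room):
        """BFS over the room graph: a simple stand-in for the
        reactive-synthesis high-level planner."""
        frontier, parent = [room], {room: None}
        while frontier:
            r = frontier.pop(0)
            if r == self.goal_room:
                path = [r]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return list(reversed(path))
            for nxt in self.room_graph[r]:
                if nxt not in parent:
                    parent[nxt] = r
                    frontier.append(nxt)
        return None  # goal room unreachable

    def act(self, room, state):
        """One control step: plan a room-level route, then delegate to
        the low-level policy for the first edge on that route."""
        path = self.plan(room)
        if path is None or len(path) < 2:
            return "stay"
        return self.policies[(room, path[1])](state)

# Three rooms in a line: A - B - C, goal in room C.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
exits = {("A", "B"): (4, 2), ("B", "A"): (0, 2),
         ("B", "C"): (4, 2), ("C", "B"): (0, 2)}
ctrl = HierarchicalController(graph, exits, goal_room="C")
print(ctrl.plan("A"))         # room-level route to the goal
print(ctrl.act("A", (1, 2)))  # low-level action toward the A->B exit
```

The separation mirrors the abstract's design: low-level policies can be retrained or reused per room without touching the planner, and the planner reasons only over the small room graph rather than the full product MDP.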
Feb-21-2024