Predictable MDP Abstraction for Unsupervised Model-Based RL

Jun-3-2023, arXiv.org Artificial Intelligence

A key component of model-based reinforcement learning (RL) is a dynamics model that predicts the outcomes of actions. Errors in this predictive model can degrade the performance of model-based controllers, and complex Markov decision processes (MDPs) can present exceptionally difficult prediction problems. To mitigate this issue, we propose predictable MDP abstraction (PMA): instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space that only permits predictable, easy-to-model actions, while covering the original state-action space as much as possible. As a result, model learning becomes easier and more accurate, which allows robust, stable model-based planning or model-based RL. This transformation is learned in an unsupervised manner, before any task is specified by the user. Downstream tasks can then be solved with model-based control in a zero-shot fashion, without additional environment interactions. We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches in a range of benchmark environments. Our code and videos are available at https://seohong.me/projects/pma/
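The core idea, restricting the action space to actions whose outcomes are easy to model, can be illustrated with a toy sketch. This is not the paper's method or objective: the 1-D dynamics in `step`, the hand-written `decode` function (standing in for a learned action decoder), and the least-squares model are all assumptions made purely for illustration. The sketch fits the same simple model once on the original actions and once on latent actions that decode only into the predictable part of the action range, then compares prediction error.

```python
import random

def step(state, action, rng):
    """Toy 1-D dynamics (hypothetical): large actions have noisy outcomes."""
    noise_std = 0.0 if abs(action) <= 0.5 else 1.0
    return state + action + rng.gauss(0.0, noise_std)

def decode(latent_action):
    """Hand-written stand-in for a learned action decoder: maps z in [-1, 1]
    onto the predictable part of the original action range, [-0.5, 0.5]."""
    return 0.5 * latent_action

def fit_and_score(inputs, deltas):
    """Fit delta ~= w * input by least squares; return (w, mean squared error)."""
    w = sum(u * d for u, d in zip(inputs, deltas)) / sum(u * u for u in inputs)
    mse = sum((d - w * u) ** 2 for u, d in zip(inputs, deltas)) / len(inputs)
    return w, mse

def collect(n, use_abstraction, seed=0):
    """Gather n random transitions, acting through the decoder if requested."""
    rng = random.Random(seed)
    inputs, deltas = [], []
    for _ in range(n):
        state = rng.uniform(-1.0, 1.0)
        u = rng.uniform(-1.0, 1.0)  # raw action, or latent action z
        action = decode(u) if use_abstraction else u
        next_state = step(state, action, rng)
        inputs.append(u)
        deltas.append(next_state - state)
    return inputs, deltas

_, mse_original = fit_and_score(*collect(2000, use_abstraction=False))
w_latent, mse_latent = fit_and_score(*collect(2000, use_abstraction=True))
print(f"model error on original actions: {mse_original:.4f}")
print(f"model error on latent actions:   {mse_latent:.4f}")
```

In this toy setting the model on the latent action space is essentially exact, at the cost of giving up the noisy half of the original action range. The abstract describes the same trade-off: the learned action space should permit only predictable actions while covering as much of the original state-action space as possible.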

artificial intelligence, machine learning, reinforcement learning

Country:
- Europe > Poland (0.04)
- North America > United States
  - Hawaii > Honolulu County
    - Honolulu (0.04)
  - California > Alameda County
    - Berkeley (0.04)
- Asia > Middle East
  - Jordan (0.04)

Genre:
- Research Report (0.82)

Technology:
- Information Technology
  - Modeling & Simulation (1.00)
  - Artificial Intelligence
    - Robots (1.00)
    - Representation & Reasoning (0.93)
    - Machine Learning
      - Reinforcement Learning (0.89)
      - Learning Graphical Models > Undirected Networks > Markov Models (0.66)
