Transferable Reinforcement Learning via Generalized Occupancy Models

Zhu, Chuning, Wang, Xinqi, Han, Tyler, Du, Simon S., Gupta, Abhishek

May 28, 2024, arXiv.org Artificial Intelligence

Intelligent agents must be generalists, capable of quickly adapting to various tasks. In reinforcement learning (RL), model-based RL learns a dynamics model of the world, in principle enabling transfer to arbitrary reward functions through planning. However, autoregressive model rollouts suffer from compounding error, making model-based RL ineffective for long-horizon problems. Successor features offer an alternative by modeling a policy's long-term state occupancy, reducing policy evaluation under new tasks to linear reward regression. Yet policy improvement with successor features can be challenging. This work proposes a novel class of models, generalized occupancy models (GOMs), that learn a distribution of successor features from a stationary dataset, along with a policy that acts to realize different successor features. These models can quickly select the optimal action for arbitrary new tasks. By directly modeling long-term outcomes in the dataset, GOMs avoid compounding error while enabling rapid transfer across reward functions. We present a practical instantiation of GOMs using diffusion models and show their efficacy as a new class of transferable models, both theoretically and empirically across various simulated robotics problems.
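To make the "linear reward regression" step concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the new task's reward is linear in some state feature map, fits the task weights by least squares, and then scores candidate successor features with a dot product. All names (`phi`, `w_true`, `psi_candidates`) are hypothetical, and the candidate outcomes are random stand-ins for samples that a learned outcome model (a diffusion model in the paper) would produce.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: rewards for the new task are linear in state features,
# r(s) = phi(s) @ w. Both phi and w_true are synthetic here.
d, n = 4, 200
phi = rng.normal(size=(n, d))             # features of dataset states
w_true = np.array([1.0, -2.0, 0.5, 0.0])  # hypothetical task weights
rewards = phi @ w_true                    # reward labels for the new task

# Linear reward regression: recover the task weights from (phi, r) pairs.
w_hat, *_ = np.linalg.lstsq(phi, rewards, rcond=None)

# Candidate successor features (discounted sums of phi along a trajectory).
# Random stand-ins for samples from a learned outcome distribution.
k = 8
psi_candidates = rng.normal(size=(k, d))

# Under the linear-reward assumption, the return of each candidate outcome
# is a dot product with the regressed weights.
values = psi_candidates @ w_hat
best = int(np.argmax(values))  # index of the highest-value outcome
```

In this sketch, adapting to a new reward function only requires refitting `w_hat`; the candidate outcomes themselves do not depend on the task. A policy conditioned on the selected outcome would then supply the action, a step omitted here.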


