Offline Reinforcement Learning with Reverse Model-based Imagination

Mar-23-2025, 02:18:43 GMT–Neural Information Processing Systems

In offline reinforcement learning (offline RL), one of the main challenges is dealing with the distributional shift between the learning policy and the given dataset. To address this problem, recent offline RL methods introduce a conservatism bias that encourages learning in high-confidence areas. Model-free approaches encode this bias directly into policy or value function learning through conservative regularization or special network structures, but their constrained policy search limits generalization beyond the offline dataset. Model-based approaches learn forward dynamics models with conservatism quantifications and then generate imaginary trajectories to extend the offline dataset. However, because offline datasets contain limited samples, conservatism quantifications often suffer from overgeneralization in out-of-support regions.
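The forward model-based recipe described above (learn a dynamics model, quantify conservatism, generate imaginary trajectories) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 1-D linear toy dynamics, the bootstrap ensemble, the use of ensemble disagreement as the conservatism quantification, the random rollout policy, and the reward `-s_next**2`. It is not the method proposed in the paper, only a generic instance of the forward model-based approach the abstract summarizes.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical offline dataset of (state, action, next_state) transitions
# drawn from assumed 1-D linear dynamics with small Gaussian noise.
TRUE_W_S, TRUE_W_A = 0.9, 0.5
dataset = []
for _ in range(200):
    s, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
    dataset.append((s, a, TRUE_W_S * s + TRUE_W_A * a + rng.gauss(0, 0.01)))


def fit_linear(transitions):
    """Least-squares fit of s' ~ w_s*s + w_a*a via the 2x2 normal equations."""
    sss = sum(s * s for s, a, n in transitions)
    saa = sum(a * a for s, a, n in transitions)
    ssa = sum(s * a for s, a, n in transitions)
    ssn = sum(s * n for s, a, n in transitions)
    san = sum(a * n for s, a, n in transitions)
    det = sss * saa - ssa * ssa
    w_s = (ssn * saa - san * ssa) / det
    w_a = (san * sss - ssn * ssa) / det
    return w_s, w_a


def fit_ensemble(data, k=5):
    """Fit k forward models, each on a bootstrap resample of the data."""
    return [fit_linear([rng.choice(data) for _ in data]) for _ in range(k)]


def imagine(ensemble, data, n_starts=10, horizon=3, lam=1.0):
    """Roll out short imagined trajectories starting from dataset states.

    Ensemble disagreement (max minus min prediction) stands in for the
    conservatism quantification and is subtracted from the assumed reward.
    """
    rollouts = []
    for _ in range(n_starts):
        s = rng.choice(data)[0]
        for _ in range(horizon):
            a = rng.uniform(-1, 1)  # placeholder random policy
            preds = [w_s * s + w_a * a for w_s, w_a in ensemble]
            s_next = sum(preds) / len(preds)
            disagreement = max(preds) - min(preds)
            reward = -s_next ** 2  # assumed toy reward
            rollouts.append((s, a, reward - lam * disagreement, s_next))
            s = s_next
    return rollouts


ensemble = fit_ensemble(dataset)
imagined = imagine(ensemble, dataset)
print(len(dataset), "real transitions,", len(imagined), "imagined transitions")
```

In this toy setting the ensemble members agree closely everywhere, including for inputs far outside the data range, because a linear model extrapolates confidently. That gives a small-scale picture of how a conservatism quantification fitted on limited samples can overgeneralize in out-of-support regions.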

artificial intelligence, machine learning, reinforcement learning


Genre:
- Research Report > New Finding (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
