MOReL: Model-Based Offline Reinforcement Learning
–Neural Information Processing Systems
In offline reinforcement learning (RL), the goal is to learn a highly rewarding policy based solely on a dataset of historical interactions with the environment. This serves as an extreme test of an agent's ability to use historical data effectively, which is known to be critical for efficient RL. Prior work in offline RL has been confined almost exclusively to model-free RL approaches.
Dec-24-2025, 21:54:40 GMT