When to Trust Your Simulator: Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning

Jan-19-2025, 06:01:58 GMT – Neural Information Processing Systems

Learning effective reinforcement learning (RL) policies for complex real-world tasks is challenging without a high-fidelity simulation environment. In most cases, only imperfect simulators with simplified dynamics are available, and these inevitably lead to severe sim-to-real gaps in RL policy learning. The recently emerged field of offline RL offers an alternative: learning policies directly from pre-collected historical data. However, to achieve reasonable performance, existing offline RL algorithms need impractically large offline datasets with sufficient state-action space coverage. This raises a new question: is it possible to combine learning from limited real data, as in offline RL, with unrestricted exploration through imperfect simulators, as in online RL, to address the drawbacks of both approaches?

dynamics-aware hybrid offline-and-online reinforcement learning, imperfect simulator, offline RL algorithm

Genre:
- Instructional Material > Online (0.44)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)