Offline Model-based Adaptable Policy Learning
Neural Information Processing Systems
In reinforcement learning, a promising way to avoid the cost of online trial and error is to learn from an offline dataset. Current offline reinforcement learning methods commonly constrain policy learning to the regions supported by the offline dataset, in order to ensure the robustness of the resulting policies.
Dec-24-2025, 01:42:12 GMT