Supported Value Regularization for Offline Reinforcement Learning
Neural Information Processing Systems
Offline reinforcement learning suffers from extrapolation error and value overestimation caused by out-of-distribution (OOD) actions.
Oct-8-2025, 23:41:44 GMT
- Country:
- Asia > China
- Liaoning Province > Dalian (0.04)
- North America > United States
- Washington > King County > Seattle (0.04)
- Technology: