COMBO: Conservative Offline Model-Based Policy Optimization
Neural Information Processing Systems
Offline reinforcement learning (offline RL) [30, 34] refers to the setting where policies are trained using static, previously collected datasets. This presents an attractive paradigm for data reuse and safe policy learning in many applications, such as healthcare [62], autonomous driving [65], robotics [25, 48], and personalized recommendation systems [59].