COMBO: Conservative Offline Model-Based Policy Optimization

Neural Information Processing Systems 

Offline reinforcement learning (offline RL) [30, 34] refers to the setting where policies are trained using static, previously collected datasets. This presents an attractive paradigm for data reuse and safe policy learning in many applications, such as healthcare [62], autonomous driving [65], robotics [25, 48], and personalized recommendation systems [59].
