Supported Value Regularization for Offline Reinforcement Learning

Neural Information Processing Systems 

Offline reinforcement learning suffers from extrapolation error and value overestimation caused by out-of-distribution (OOD) actions.
