Worst-Case Offline Reinforcement Learning with Arbitrary Data Support

Neural Information Processing Systems 

We propose an offline reinforcement learning (RL) method with a performance guarantee that holds without any assumptions on the data support.
