DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs
