Towards Instance-Optimal Offline Reinforcement Learning with Pessimism
