Bellman-consistent Pessimism for Offline Reinforcement Learning
