Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
