Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
