Model-Based Reinforcement Learning Is Minimax-Optimal for Offline Zero-Sum Markov Games
