Review for NeurIPS paper: Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity

Neural Information Processing Systems 

Additional Feedback: NOTE (post rebuttal): I didn't change my review, as a matter of expediency, but I appreciate the authors' acknowledgement of my comments and support their plan for addressing them. I guess a brief clarification wouldn't hurt, but I wouldn't suggest using space to delve into the setting more deeply.

"corner stones" should be "cornerstones".

"the sample complexity of model based MARL algorithms has rarely been investigated": The Rmax paper was one of the first papers to study RL sample complexity, and it also dealt with MARL. Maybe it hasn't been "recently" investigated?