Review for NeurIPS paper: Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model

Weaknesses: The significance of "breaking the barrier" is somewhat questionable, since it appears to matter only under a lower-bound assumption on the target accuracy epsilon. This is a bit strange, because in practice we want the accuracy to be high, i.e., the error epsilon to be small. In particular, the result does not appear to improve on previous bounds if epsilon is taken to be a constant, for example.

EDIT: Thank you to the authors for your response. Here is some more explanation of my concern. My comment was motivated by asking under what conditions the new bound derived by the authors is actually a strict improvement over the previous bound.
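To make the concern concrete, here is a small numerical sketch. I am assuming (based on my reading of the paper, with constants and log factors dropped) that the minimax rate is |S||A| / ((1-gamma)^3 eps^2) and that prior analyses additionally required the sample size to exceed a burn-in of order |S||A| / (1-gamma)^2; the specific values of S, A, gamma, and eps below are illustrative, not from the paper.

```python
import math

def n_required(S, A, gamma, eps):
    # Minimax sample-size rate |S||A| / ((1-gamma)^3 * eps^2),
    # ignoring constants and log factors (assumed form, see lead-in).
    return S * A / ((1 - gamma) ** 3 * eps ** 2)

def burn_in(S, A, gamma):
    # Prior analyses' assumed burn-in requirement: |S||A| / (1-gamma)^2.
    return S * A / (1 - gamma) ** 2

S, A, gamma = 100, 10, 0.99

# Constant (small) accuracy eps: the required sample size already
# exceeds the burn-in, so the new result adds nothing in this regime.
eps = 0.1
print(n_required(S, A, gamma, eps) >= burn_in(S, A, gamma))

# Low-accuracy regime eps > 1/sqrt(1-gamma): the requested sample size
# falls below the burn-in, so only the new bound covers this regime.
eps = 2 / math.sqrt(1 - gamma)
print(n_required(S, A, gamma, eps) < burn_in(S, A, gamma))
```

If these forms are right, the burn-in binds exactly when eps exceeds roughly 1/sqrt(1-gamma), i.e., only in the low-accuracy (sample-starved) regime, which is what my comment about a lower-bound assumption on epsilon was getting at.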