Review for NeurIPS paper: Steady State Analysis of Episodic Reinforcement Learning

Neural Information Processing Systems 

Clarity: In my opinion, the main weakness of the paper is its presentation. First, the paper lacks a clear, direct explanation of what it is trying to accomplish. Several crucial points are either only implied or mentioned in passing without proper emphasis. This applies even to the positioning of the paper itself: the analysis seems to be mostly concerned with policy gradient methods, but this is never explicitly stated.