Review for NeurIPS paper: Constrained episodic reinforcement learning in concave-convex and knapsack settings

Neural Information Processing Systems 

Weaknesses: My major concerns: 1. Line 248 suggested that linear programming could be used in ConPlanner, but the experiments instead tested different unconstrained RL planners under a Lagrangian heuristic. I think the paper should have compared the results of different constrained problem solvers. While the theoretical proofs were plentiful, the paper didn't provide any empirical support, making the method less intuitive. Although the paper claimed to compare the proposed framework with other concave-convex approaches, the problems used in the experiments didn't seem to be concave-convex. Grid-world problems such as the Mars rover used in the paper have linear constraints rather than general convex ones.