Reviews: Evolution-Guided Policy Gradient in Reinforcement Learning

Neural Information Processing Systems 

Post discussion update: I have increased my score. In particular they took to heart my concern about running more experiments to tease apart why the system is performing well. Obviously they did not run all the experiments I asked for, but I hope they consider doing even more if accepted. I would still like to emphasize that the paper is much more interesting if you remove the focus on SOTA results. Understanding why your system works well, and when it doesn't is much more likely to have a long-lasting scientific impact on the field whereas SOTA changes frequently.