Reviews: The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares

Neural Information Processing Systems 

Originality: I'm not an expert in this subfield, but as far as I know the work is original.

Quality: I think this is a very nice paper. The results are interesting and clearly stated, and the work addresses an important question about the difference between the learning rate schedules and averaging schemes analysed in theory and those used in practice. I have a couple of major comments: 1) organisation of the paper.