Reviews: The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares
–Neural Information Processing Systems
Originality: I'm not an expert on this subfield, but as far as I know the work is original.
Quality: I think this is a very nice paper. The results are interesting and clearly stated, and the work addresses an important question about the difference between the learning rate schedules and averaging schemes studied in theory and those used in practice. I have a couple of major comments: 1) organisation of the paper.
Jan-22-2025, 15:34:18 GMT