Note on Learning Rate Schedules for Stochastic Optimization

Darken, Christian, Moody, John E.

Neural Information Processing Systems 

We present and compare learning rate schedules for stochastic gradient descent, a general algorithm which includes LMS, on-line backpropagation, and k-means clustering as special cases. We introduce "search-then-converge" type schedules which outperform the classical constant and "running average" (1/t) schedules both in speed of convergence and quality of solution.
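The three families of schedules the abstract names can be sketched as follows. This is a minimal illustration, not the paper's exact parameterization: the search-then-converge form below, eta0 / (1 + t/tau), is its simplest instance, and the parameter names eta0 and tau are ours. It stays roughly constant (the "search" phase) while t is small relative to tau, then decays like eta0 * tau / t (the "converge" phase) for large t.

```python
def eta_constant(t, eta0=0.1):
    # Classical constant schedule: the rate never decays.
    return eta0

def eta_running_average(t, eta0=0.1):
    # Classical "running average" (1/t) schedule: eta_t = eta0 / (t + 1).
    return eta0 / (t + 1)

def eta_search_then_converge(t, eta0=0.1, tau=100.0):
    # Search-then-converge (simplest form, illustrative parameterization):
    # roughly eta0 while t << tau, asymptotically eta0 * tau / t for t >> tau.
    return eta0 / (1.0 + t / tau)
```

With these settings, at t = 1000 the 1/t schedule has already shrunk the rate by three orders of magnitude, while the search-then-converge schedule is only about a factor of eleven below eta0, which is the informal sense in which it "searches" longer before "converging".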