Using Curvature Information for Fast Stochastic Search

Orr, Genevieve B., Leen, Todd K.

Neural Information Processing Systems 

We present an algorithm for fast stochastic gradient descent that uses a nonlinear adaptive momentum scheme to optimize the late-time convergence rate. The algorithm makes effective use of curvature information, requires only O(n) storage and computation, and delivers convergence rates close to the theoretical optimum. We demonstrate the technique on linear networks and on large nonlinear backprop networks.
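For orientation, here is a minimal sketch of the baseline that the abstract builds on: stochastic gradient descent with a momentum term on a linear least-squares problem. It is not the paper's algorithm. The momentum coefficient here is a fixed scalar, whereas the abstract describes a nonlinear adaptive momentum scheme whose update rule is not given on this page. The function names, the annealing schedule `mu0 / (1 + t / tau)`, and all parameter values are assumptions chosen for illustration. The sketch does show the O(n) cost mentioned above: the only extra state is one momentum vector the same size as the weights.

```python
import random


def make_linear_data(true_w, n_samples, seed=1):
    """Hypothetical helper: noise-free samples (x, y) with y = true_w . x."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        y = sum(wi * xi for wi, xi in zip(true_w, x))
        samples.append((x, y))
    return samples


def sgd_momentum(samples, n_dims, mu0=0.1, tau=100.0, beta=0.5,
                 n_steps=5000, seed=0):
    """SGD with a FIXED momentum coefficient beta (an assumption; the paper
    adapts the momentum) and an annealed learning rate mu0 / (1 + t / tau)."""
    rng = random.Random(seed)
    w = [0.0] * n_dims   # weights
    v = [0.0] * n_dims   # momentum buffer: the only extra O(n) storage
    for t in range(n_steps):
        x, y = rng.choice(samples)                      # one random example
        err = sum(wi * xi for wi, xi in zip(w, x)) - y  # prediction error
        mu = mu0 / (1.0 + t / tau)                      # annealed step size
        for i in range(n_dims):
            # Gradient of 0.5 * err**2 with respect to w[i] is err * x[i].
            v[i] = beta * v[i] - mu * err * x[i]
            w[i] += v[i]
    return w


samples = make_linear_data([2.0, -1.0], n_samples=200)
w = sgd_momentum(samples, n_dims=2)
print("estimated weights:", w)
```

With a fixed `beta`, the effective late-time learning rate is roughly `mu / (1 - beta)`, which is the quantity a momentum scheme can tune. How the paper uses curvature information to choose the momentum is not stated in the abstract and is left out of the sketch.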
