Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors

LeCun, Yann, Simard, Patrice Y., Pearlmutter, Barak

Dec-31-1993–Neural Information Processing Systems

Inst., 19600 NW vonNeumann Dr, Beaverton, OR 97006

Abstract

We propose a very simple and well-principled way of computing the optimal step size in gradient descent algorithms. The online version is computationally very efficient and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second derivative matrix (Hessian), which does not require even calculating the Hessian. Several other applications of this technique are proposed for speeding up learning or for eliminating useless parameters.

1 INTRODUCTION

Choosing the appropriate learning rate, or step size, in a gradient descent procedure such as backpropagation is simultaneously one of the most crucial and most expert-intensive parts of neural-network learning. We propose a method for computing the best step size that is well-principled, simple, very cheap computationally, and, most of all, applicable to online training with large networks and data sets.
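To make the idea of estimating the Hessian's principal eigenvalue without forming the Hessian concrete, the following is a minimal sketch rather than the authors' exact procedure. It assumes a power iteration in which each Hessian-vector product is approximated by a finite difference of two gradients, and it assumes the step size is then taken as the reciprocal of the estimated largest eigenvalue. The toy quadratic objective and the names `grad`, `estimate_top_eigenvalue`, `alpha`, and `iters` are hypothetical and chosen only for illustration.

```python
import numpy as np


def grad(w, A, b):
    """Gradient of the toy quadratic objective E(w) = 0.5 * w^T A w - b^T w."""
    return A @ w - b


def estimate_top_eigenvalue(grad_fn, w, alpha=1e-3, iters=200, seed=0):
    """Power iteration on the Hessian using only gradient evaluations.

    Assumption: H d is approximated by (grad(w + alpha * d) - grad(w)) / alpha,
    so the Hessian matrix itself is never built or stored.
    """
    rng = np.random.default_rng(seed)
    psi = rng.standard_normal(w.shape)  # random starting vector
    g0 = grad_fn(w)                     # gradient at the current weights
    for _ in range(iters):
        direction = psi / np.linalg.norm(psi)
        # Finite-difference Hessian-vector product along the unit direction.
        psi = (grad_fn(w + alpha * direction) - g0) / alpha
    # The norm of psi estimates the largest eigenvalue; its direction
    # estimates the corresponding eigenvector.
    return np.linalg.norm(psi), psi / np.linalg.norm(psi)


# Toy problem whose Hessian is diag(1, 4, 10), so the largest eigenvalue is 10.
A = np.diag([1.0, 4.0, 10.0])
b = np.zeros(3)
w = np.array([1.0, 1.0, 1.0])

lam_max, v = estimate_top_eigenvalue(lambda x: grad(x, A, b), w)
step_size = 1.0 / lam_max  # assumed rule: reciprocal of the top eigenvalue
```

Each iteration costs one extra gradient evaluation, which is why a scheme of this kind can remain cheap for large networks. On a non-quadratic objective the finite-difference product is only approximate and `alpha` would need tuning.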

artificial intelligence, hessian, neural network

Country:
- North America > United States
  - Massachusetts > Hampshire County (0.14)
  - Oregon > Washington County
    - Beaverton (0.24)

Industry:
- Education (0.74)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (1.00)
  - Statistical Learning > Gradient Descent (0.58)
