AITopics | automatic learning rate maximization


automatic learning rate maximization


Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors

Neural Information Processing Systems, Apr-6-2023, 19:06:09 GMT

We propose a very simple, and well principled way of computing the optimal step size in gradient descent algorithms. The on-line version is very efficient computationally, and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second derivative matrix (Hessian), which does not require to even calculate the Hessian. Several other applications of this technique are proposed for speeding up learning, or for eliminating useless parameters.
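As a hedged aside on why the principal eigenvalue governs the step size: the following is the standard quadratic-model argument, not a derivation quoted from the abstract. It assumes the objective $E$ is well approximated near a minimum $w^*$ by a quadratic with positive-definite Hessian $H$; the symbols $w_t$, $\eta$, and $\lambda_i$ are introduced here for illustration.

```latex
E(w) \approx E(w^*) + \tfrac{1}{2}\,(w - w^*)^\top H\,(w - w^*),
\qquad
w_{t+1} = w_t - \eta\,\nabla E(w_t)
\;\Longrightarrow\;
w_{t+1} - w^* = (I - \eta H)\,(w_t - w^*).
```

Along an eigenvector of $H$ with eigenvalue $\lambda_i$, the error is multiplied by $(1 - \eta\lambda_i)$ at each step, so under this model convergence requires $0 < \eta < 2/\lambda_{\max}$. An estimate of the largest eigenvalue therefore bounds the usable step size.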

automatic learning rate maximization, hessian, on-line estimation, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors

LeCun, Yann, Simard, Patrice Y., Pearlmutter, Barak

Neural Information Processing Systems, Dec-31-1993

We propose a very simple, and well principled way of computing the optimal step size in gradient descent algorithms. The online version is very efficient computationally, and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second derivative matrix (Hessian), which does not require to even calculate the Hessian. Several other applications of this technique are proposed for speeding up learning, or for eliminating useless parameters.

1 INTRODUCTION

Choosing the appropriate learning rate, or step size, in a gradient descent procedure such as backpropagation, is simultaneously one of the most crucial and expert-intensive parts of neural-network learning. We propose a method for computing the best step size which is well-principled, simple, very cheap computationally, and, most of all, applicable to online training with large networks and data sets.
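The excerpt does not give the update rule, so the following is only a minimal sketch of one way to estimate the principal eigenvalue and eigenvector without forming the Hessian. It assumes power iteration in which each Hessian-vector product is approximated by a finite difference of two gradient evaluations. The toy quadratic objective, the function names `grad` and `principal_eigen`, and the parameters `alpha` and `iters` are all hypothetical and chosen for illustration.

```python
import math


def grad(w):
    """Gradient of a toy objective E(w) = 0.5 * w^T H w (hypothetical example)."""
    H = [[4.0, 1.0], [1.0, 3.0]]
    return [sum(H[i][j] * w[j] for j in range(len(w))) for i in range(len(w))]


def principal_eigen(grad_fn, w, alpha=1e-3, iters=200):
    """Estimate the largest Hessian eigenvalue and its eigenvector at w.

    Assumption: the Hessian-vector product H*u is approximated by
    (grad(w + alpha*u) - grad(w)) / alpha, so H is never built explicitly.
    """
    psi = [1.0] * len(w)   # arbitrary non-zero starting vector
    g0 = grad_fn(w)        # gradient at the current point, reused every iteration
    for _ in range(iters):
        norm = math.sqrt(sum(p * p for p in psi))
        unit = [p / norm for p in psi]
        g1 = grad_fn([wi + alpha * ui for wi, ui in zip(w, unit)])
        psi = [(a - b) / alpha for a, b in zip(g1, g0)]  # approximates H * unit
    lam = math.sqrt(sum(p * p for p in psi))  # the norm converges to the top eigenvalue
    return lam, [p / lam for p in psi]


lam, vec = principal_eigen(grad, [0.5, -0.2])
eta = 1.0 / lam  # assumed step-size choice: reciprocal of the estimated eigenvalue
print(lam, vec, eta)
```

Each iteration costs one extra gradient evaluation, which is why an approach of this kind stays cheap for large networks. Setting the step size to the reciprocal of the estimated eigenvalue is an assumption of this sketch rather than a rule stated in the excerpt.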

artificial intelligence, eigenvalue, machine learning, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Colorado > Denver County > Denver (0.05)
(2 more...)

Industry: Education > Educational Setting > Online (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.58)


Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors

LeCun, Yann, Simard, Patrice Y., Pearlmutter, Barak

Neural Information Processing Systems, Dec-31-1993

automatic learning rate maximization, eigenvalue, hessian, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Colorado > Denver County > Denver (0.05)
(2 more...)

Industry: Education > Educational Setting > Online (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.58)


Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors

LeCun, Yann, Simard, Patrice Y., Pearlmutter, Barak

Neural Information Processing Systems, Dec-31-1993

Inst., 19600 NW vonNeumann Dr, Beaverton, OR 97006

Abstract

We propose a very simple, and well principled way of computing the optimal step size in gradient descent algorithms. The online version is very efficient computationally, and is applicable to large backpropagation networks trained on large data sets. The main ingredient is a technique for estimating the principal eigenvalue(s) and eigenvector(s) of the objective function's second derivative matrix (Hessian), which does not require to even calculate the Hessian. Several other applications of this technique are proposed for speeding up learning, or for eliminating useless parameters.

1 INTRODUCTION

Choosing the appropriate learning rate, or step size, in a gradient descent procedure such as backpropagation, is simultaneously one of the most crucial and expert-intensive parts of neural-network learning. We propose a method for computing the best step size which is well-principled, simple, very cheap computationally, and, most of all, applicable to online training with large networks and data sets.

artificial intelligence, eigenvalue, machine learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Oregon > Washington County > Beaverton (0.24)
North America > United States > Massachusetts > Hampshire County (0.14)

Industry: Education > Educational Setting > Online (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.58)
