Neurons learn slower than they think

Apr-2-2021–arXiv.org Artificial Intelligence

Recent studies have revealed complex convergence dynamics in gradient-based methods, which remain poorly understood. Changing the step size to balance a high convergence rate against a small generalization error may not be sufficient: maximizing the test accuracy usually requires a larger learning rate than minimizing the training loss. To explore the dynamic bounds of the convergence rate, this study introduces differential capability into an optimization process; it measures whether the test accuracy increases as fast as a model approaches the decision boundary in a classification problem. The convergence analysis showed that: 1) a higher convergence rate leads to slower capability growth; 2) a lower convergence rate results in faster capability growth and decay; 3) regulating the convergence rate in either direction reduces differential capability.
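The trade-off described in the abstract, step size against training loss and test accuracy, can be observed on a toy problem. The following is a minimal sketch under stated assumptions, not the paper's method: it runs full-batch gradient descent for a one-feature logistic regression on synthetic data and records the training loss and test accuracy after every step for a few learning rates. The data generator, the model, the learning-rate grid, and all function names are hypothetical choices made for illustration, and the sketch does not compute the paper's differential capability measure.

```python
import math
import random


def make_data(n, seed):
    """Synthetic 1-D binary data (assumption): class means at -1 and +1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y == 1 else -1.0, 1.0)
        data.append((x, y))
    return data


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(w, b, data):
    """Mean cross-entropy of the model p = sigmoid(w * x + b)."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def accuracy(w, b, data):
    """Fraction of points on the correct side of the decision boundary."""
    correct = sum(1 for x, y in data if (w * x + b >= 0) == (y == 1))
    return correct / len(data)


def train(lr, steps, train_set, test_set):
    """Full-batch gradient descent; returns (train loss, test accuracy) per step."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in train_set:
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / len(train_set)
        b -= lr * gb / len(train_set)
        history.append((log_loss(w, b, train_set), accuracy(w, b, test_set)))
    return history


train_set = make_data(200, seed=0)
test_set = make_data(200, seed=1)

# Hypothetical learning-rate grid, chosen only to show slow vs. fast convergence.
histories = {lr: train(lr, 50, train_set, test_set) for lr in (0.01, 0.1, 1.0)}

for lr, history in histories.items():
    final_loss, final_acc = history[-1]
    print(f"lr={lr}: train loss={final_loss:.4f}, test accuracy={final_acc:.3f}")
```

Comparing the per-step curves in `histories` across learning rates gives a rough view of how quickly the test accuracy rises relative to how quickly the training loss falls. Any quantitative claim about differential capability would need the definitions from the paper itself.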

convergence rate, differential capability, gradient descent

Country:
- Asia > Russia (0.04)
- North America > United States
  - New York (0.04)
- Europe
  - Russia (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.14)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.52)
