Paper Summary: On the importance of initialization and momentum in deep learning
Classical momentum (CM) updates the parameters θ by accumulating a velocity vector v:

v_{t+1} = μ v_t − ε ∇f(θ_t)
θ_{t+1} = θ_t + v_{t+1}

where ε > 0 is the learning rate and μ ∈ [0, 1] is the momentum coefficient. The basic idea behind CM is that it accumulates velocity in directions of persistent reduction in the objective across iterations. Directions of low curvature, along which the objective changes slowly, tend to point consistently the same way across iterations, and so the velocity along them is amplified by CM. The authors then describe Nesterov's Accelerated Gradient (NAG):

v_{t+1} = μ v_t − ε ∇f(θ_t + μ v_t)
θ_{t+1} = θ_t + v_{t+1}

While CM computes the gradient at the current position θ_t, NAG first performs a partial update, computing θ_t + μ v_t, which is similar to θ_{t+1} but missing the as-yet-unknown gradient correction.
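The following is a minimal sketch of the two update rules in NumPy, not code from the paper; the function names cm_step and nag_step, the toy quadratic objective, and the hyperparameter values are all illustrative choices. The only difference between the two rules is the point at which the gradient is evaluated.

```python
import numpy as np

def cm_step(theta, v, grad_f, lr=0.01, mu=0.9):
    """One classical momentum (CM) step: the gradient is
    evaluated at the current parameters theta."""
    v_next = mu * v - lr * grad_f(theta)
    return theta + v_next, v_next

def nag_step(theta, v, grad_f, lr=0.01, mu=0.9):
    """One Nesterov accelerated gradient (NAG) step: the gradient
    is evaluated at the partially updated point theta + mu * v."""
    v_next = mu * v - lr * grad_f(theta + mu * v)
    return theta + v_next, v_next

# Illustrative usage on a toy quadratic f(theta) = 0.5 * theta @ A @ theta,
# chosen to be ill-conditioned so the low-curvature direction matters.
A = np.diag([1.0, 100.0])
grad_f = lambda th: A @ th
theta, v = np.array([1.0, 1.0]), np.zeros(2)
for _ in range(100):
    theta, v = nag_step(theta, v, grad_f, lr=0.005, mu=0.9)
print(theta)  # approaches the minimum at the origin
```

Because NAG's gradient is taken at the lookahead point θ_t + μ v_t, it can correct the velocity more quickly when the momentum step overshoots, which is the intuition the paper gives for its improved stability over CM.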