Optimizers -- Momentum and Nesterov momentum algorithms (Part 2)


Welcome to the second part of this series on optimizers, where we will discuss momentum and Nesterov accelerated gradient. If you want a quick review of the vanilla gradient descent algorithm and its variants, please read Part 1. In Part 3 of this series, I will explain RMSprop and Adam in detail. Gradient descent uses gradients to update the weights, and these gradients can sometimes be noisy. In mini-batch gradient descent, each update is based only on the data in the current batch, so there is some variance in the direction of successive updates.
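To make the noise-smoothing idea concrete, here is a minimal sketch of the classical momentum update (the function and variable names are my own, not from a particular library): the velocity `v` is an exponentially decaying average of past gradients, which damps the batch-to-batch variance in the update direction.

```python
import numpy as np

def momentum_update(w, v, grad, lr=0.01, beta=0.9):
    """One classical momentum step (illustrative sketch).

    v_t = beta * v_{t-1} + grad   # velocity accumulates past gradients
    w_t = w_{t-1} - lr * v_t
    """
    v = beta * v + grad
    w = w - lr * v
    return w, v

# Toy demo: minimize f(w) = w^2, whose gradient is 2w.
w, v = np.array([5.0]), np.array([0.0])
for _ in range(200):
    grad = 2 * w
    w, v = momentum_update(w, v, grad)
print(w)  # close to 0
```

Because `v` averages recent gradients, a single noisy mini-batch gradient pulls the weights off course far less than it would in plain SGD.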
