What, Why, and How of SGD Momentum Optimizer in Deep Learning

Sep-26-2022, 09:34:13 GMT–#artificialintelligence

In deep learning, stochastic gradient descent (SGD) is a commonly used optimizer: its goal is to find the weights and biases at which the model's loss is lowest. Plain SGD has some issues, however. In deep learning the cost function is non-convex, and plain SGD can perform poorly on such a surface. Training starts from a random point, so it may end up in a local minimum and never reach the global minimum. A saddle point is a point where the surface curves upward in one direction and downward in another. Near such a point the slope changes very gradually, so the updates become small and training slows down.
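The slowdown near a saddle point can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the article: the toy loss f(x, y) = x^4/4 - x^2/2 + y^2/2 (saddle at the origin, minima at x = ±1), the starting point, the learning rate, the momentum coefficient `beta`, and the helper name `steps_to_escape`. It uses the exact gradient rather than a noisy mini-batch gradient to keep the comparison deterministic, and it uses one common formulation of the momentum update (v = beta * v - lr * grad, then w = w + v).

```python
def grad(x, y):
    # Gradient of the toy loss f(x, y) = x**4/4 - x**2/2 + y**2/2.
    # Hypothetical example surface: saddle point at (0, 0),
    # minima at (+1, 0) and (-1, 0).
    return x**3 - x, y


def steps_to_escape(lr=0.05, beta=0.0, max_steps=1000):
    """Count update steps until |x| >= 0.5, i.e. until the iterate
    has left the flat region around the saddle point.

    beta = 0.0 gives plain gradient descent; beta > 0 adds momentum.
    Returns None if the threshold is not reached within max_steps.
    """
    x, y = 1e-3, 1.0      # start very close to the saddle along x
    vx, vy = 0.0, 0.0     # velocity (accumulated past gradients)
    for step in range(1, max_steps + 1):
        gx, gy = grad(x, y)
        # Momentum update: keep a fraction beta of the old velocity
        # and add the current (negative) gradient step.
        vx = beta * vx - lr * gx
        vy = beta * vy - lr * gy
        x += vx
        y += vy
        if abs(x) >= 0.5:
            return step
    return None


plain_steps = steps_to_escape(beta=0.0)
momentum_steps = steps_to_escape(beta=0.9)
print("plain gradient descent:", plain_steps, "steps")
print("with momentum (beta=0.9):", momentum_steps, "steps")
```

In this setup the momentum run should leave the saddle region in noticeably fewer steps, because the velocity keeps accumulating the small but consistent gradient along x. The exact counts depend on the assumed learning rate and starting point.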



