Adam - momentum + y (aka. cost) terms. • r/MachineLearning


It was the Newton-Raphson method for finding roots of an equation. I thought this method mostly applies to minimization in machine learning, since the cost is always defined as a positive real-valued function. But it was pointed out to me that the Newton-Raphson update, x = x - y / dy_dx, is unstable at local minima (where dy_dx = 0), since the update blows up to infinity there.

Eventually, I landed on this update equation: x = x - ((y * dy_dx) / (y + dy_dx^2)), where dy_dx is the derivative of y wrt. x.

To relate this update equation to the title: consider just the update portion of the equation, g(x, y) = (y * x) / (y + x^2) with y > 0, where x now stands in for the gradient dy_dx. It is quite similar to Adam, since there is a squared gradient term in the denominator and the gradient term in the numerator.
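To make the comparison concrete, here is a rough Python sketch of the two step rules. It assumes the denominator of the modified update is y + dy_dx^2 and that y stays positive. The toy cost 0.25 * (x - 2)^2 + 1 and the names `newton_step`, `damped_step`, `cost`, `grad` and `minimize` are made up for illustration and are not from the post.

```python
def newton_step(y, dy_dx):
    """Newton-Raphson root-finding step: y / dy_dx (blows up as dy_dx -> 0)."""
    return y / dy_dx


def damped_step(y, dy_dx):
    """Modified step: (y * dy_dx) / (y + dy_dx**2), assuming y > 0."""
    return (y * dy_dx) / (y + dy_dx ** 2)


def cost(x):
    # Hypothetical toy cost: minimum value 1 at x = 2, so y is always positive.
    return 0.25 * (x - 2.0) ** 2 + 1.0


def grad(x):
    # Derivative of the toy cost with respect to x.
    return 0.5 * (x - 2.0)


def minimize(x, steps=100):
    # Repeatedly apply x = x - (y * dy_dx) / (y + dy_dx^2).
    for _ in range(steps):
        x = x - damped_step(cost(x), grad(x))
    return x
```

With these definitions, `damped_step(1.0, 0.0)` is exactly zero, whereas `newton_step` grows without bound as dy_dx approaches 0. When the gradient is large relative to y, the two steps nearly coincide; when it is small, the modified step is close to the plain gradient. On this toy cost, `minimize(10.0)` settles at the minimizer x = 2, though other costs may behave differently.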
