It's Only Natural: An Excessively Deep Dive Into Natural Gradient Optimization

Mar-10-2019, 15:43:40 GMT–#artificialintelligence

I'm going to tell a story: one you've almost certainly heard before, but with a different emphasis than you're used to. To a first (order) approximation, all modern deep learning models are trained using gradient descent. At each step of gradient descent, your parameter values begin at some starting point, and you move them in the direction of greatest loss reduction. You do this by taking the derivative of your loss with respect to your whole vector of parameters, otherwise called the gradient (for a scalar loss, the Jacobian is just this vector of partial derivatives). However, this is only the first derivative of your loss, and it tells you nothing about curvature, that is, how quickly your first derivative is changing.
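To make the update rule concrete, here is a minimal sketch of plain gradient descent in pure Python. The quadratic loss, its curvature constants `a` and `b`, the learning rate, and all function names are assumptions chosen for illustration; they do not come from the article.

```python
# Hypothetical toy loss: L(w) = 0.5 * (a * w0^2 + b * w1^2).
# The curvature is a along w0 and b along w1 (illustrative values only).
def loss(w, a=10.0, b=1.0):
    return 0.5 * (a * w[0] ** 2 + b * w[1] ** 2)


def grad(w, a=10.0, b=1.0):
    # First derivative of the loss with respect to each parameter.
    return [a * w[0], b * w[1]]


def gradient_descent(w, lr=0.05, steps=50):
    # Repeatedly step against the gradient, using one shared learning rate.
    for _ in range(steps):
        g = grad(w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


w_start = [1.0, 1.0]
w_final = gradient_descent(w_start)
print(loss(w_start), loss(w_final))
print(w_final)
```

In this sketch the steeply curved direction (`w0`) shrinks by half on every step, while the flatter direction (`w1`) shrinks by only 5% per step. The gradient alone gives no indication that the two directions behave so differently, which is the missing curvature information the paragraph above refers to.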



