Natasha 2: Faster Non-Convex Optimization Than SGD
The diverse world of deep learning research has given rise to numerous architectures for neural networks (convolutional networks, long short-term memory networks, etc.). However, to this date, the underlying training algorithms for neural networks are still stochastic gradient descent (SGD) and its heuristic variants. In this paper, we address the problem of designing a new algorithm that has a provably faster running time than the best known result for SGD.
Neural Information Processing Systems
Dec-31-2018