Natasha 2: Faster Non-Convex Optimization Than SGD

Zeyuan Allen-Zhu

Neural Information Processing Systems 

The diverse world of deep learning research has given rise to numerous architectures for neural networks (convolutional ones, long short-term memory ones, etc.). However, to this date, the underlying training algorithms for neural networks are still stochastic gradient descent (SGD) and its heuristic variants.
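To make the baseline concrete, here is a minimal illustrative sketch of plain SGD, the workhorse algorithm the abstract refers to. This is not the paper's Natasha 2 method; the toy objective, step size, and step count are all assumptions chosen for illustration.

```python
import numpy as np

# Illustrative sketch of plain SGD (NOT the paper's Natasha 2 algorithm):
# minimize f(x) = (1/n) * sum_i (x - a_i)^2 over a scalar x, using one
# randomly sampled component gradient per step. The minimizer is a.mean().
rng = np.random.default_rng(0)
a = rng.normal(size=100)  # toy data defining the objective

def sgd(steps=2000, lr=0.01):
    x = 0.0
    for _ in range(steps):
        i = rng.integers(len(a))   # sample one component uniformly
        grad = 2.0 * (x - a[i])    # stochastic gradient of f_i at x
        x -= lr * grad             # SGD update
    return x

x_hat = sgd()
print(x_hat, a.mean())  # x_hat hovers near the true minimizer
```

With a constant step size, the iterate does not converge exactly but fluctuates in a neighborhood of the minimizer whose radius scales with the learning rate; this variance is one of the issues that motivates variance-reduced methods such as the one in this paper.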
