Poor starting points in machine learning

Feb-8-2016–arXiv.org Machine Learning

In many settings, the method of Robbins and Monro (online stochastic gradient descent) is known to be optimal for good starting points but may not be optimal for poor ones. For poor starting points, Nesterov acceleration can help during the initial iterations, even though Nesterov methods that are not designed for stochastic approximation could hurt during later iterations. A good option is to roll off Nesterov acceleration for later iterations. The common practice of training with nontrivial minibatches enhances the advantage of Nesterov acceleration.
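The idea of rolling off Nesterov acceleration can be illustrated with a minimal sketch. It is not the paper's algorithm: the toy quadratic objective, the Gaussian gradient noise, the roll-off schedule (`momentum_schedule`), the step-size schedule (`step_size`), and all constants are assumptions chosen only for illustration. The sketch starts far from the optimum, uses full Nesterov momentum for the first iterations, then shrinks the momentum so that later iterations approach plain Robbins-Monro stochastic gradient descent.

```python
import random


def momentum_schedule(t, mu0=0.9, rolloff_start=50):
    """Hypothetical roll-off: constant momentum early, then decay like 1/t."""
    if t < rolloff_start:
        return mu0
    return mu0 * rolloff_start / t


def step_size(t, lr0=0.2, t0=100):
    """Decaying step size of Robbins-Monro type (assumed constants)."""
    return lr0 / (1.0 + t / t0)


def noisy_grad(x, target, curvature, rng, noise=0.1):
    """Gradient of 0.5 * sum(c * (x - target)**2) plus Gaussian noise."""
    return [c * (xi - ti) + rng.gauss(0.0, noise)
            for xi, ti, c in zip(x, target, curvature)]


def sgd_nesterov_rolloff(x0, target, curvature, steps=2000, seed=0):
    """SGD with Nesterov momentum that is rolled off in later iterations."""
    rng = random.Random(seed)
    x = list(x0)
    v = [0.0] * len(x)
    for t in range(1, steps + 1):
        mu = momentum_schedule(t)
        lr = step_size(t)
        # Nesterov step: evaluate the gradient at the look-ahead point.
        lookahead = [xi + mu * vi for xi, vi in zip(x, v)]
        g = noisy_grad(lookahead, target, curvature, rng)
        v = [mu * vi - lr * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    # Deliberately poor starting point, far from the optimum at (1, -2).
    print(sgd_nesterov_rolloff([100.0, -100.0], [1.0, -2.0], [1.0, 0.5]))
```

Because the momentum tends to zero, the late iterations behave like plain online stochastic gradient descent with a decaying step size, while the early iterations keep the accelerated transient that helps from a poor start. Averaging several noisy gradients per step (a minibatch) would reduce the noise passed to the momentum term; that variation is left out here for brevity.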

artificial intelligence, iteration, machine learning

Genre:
- Research Report (0.65)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.59)
