The Implicit Bias of AdaGrad on Separable Data

Jun-9-2019–arXiv.org Machine Learning

In recent years, implicit regularization from various optimization algorithms has been shown to play a crucial role in the generalization abilities of deep neural networks [Salakhutdinov and Srebro, 2015, Neyshabur et al., 2015, Keskar et al., 2016, Neyshabur et al., 2017, Zhang et al., 2017]. For example, in underdetermined problems where the number of parameters is larger than the number of training examples, many global optima fail to exhibit good generalization properties. However, a specific optimization algorithm (such as gradient descent) does converge to a particular optimum that generalizes well, although no explicit regularization is enforced when training the model.
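As a toy illustration of an algorithm selecting one optimum among many, the sketch below runs plain gradient descent on an underdetermined least-squares problem (3 parameters, 2 examples). This is a simplified stand-in, not the AdaGrad-on-separable-data setting the paper studies: the squared loss, the data `X` and `y`, the step size, and the function name are all assumptions chosen for illustration. Started from zero, gradient descent on this problem converges to the minimum-Euclidean-norm interpolating solution, even though no regularizer appears in the loss.

```python
def gradient_descent_least_squares(X, y, lr=0.1, steps=2000):
    """Plain gradient descent on 0.5 * ||Xw - y||^2, starting from w = 0."""
    n_params = len(X[0])
    w = [0.0] * n_params
    for _ in range(steps):
        # Residual of each training example under the current weights.
        residuals = [
            sum(x_j * w_j for x_j, w_j in zip(row, w)) - target
            for row, target in zip(X, y)
        ]
        # Gradient of the loss: X^T (Xw - y).
        grad = [
            sum(r * row[j] for r, row in zip(residuals, X))
            for j in range(n_params)
        ]
        w = [w_j - lr * g_j for w_j, g_j in zip(w, grad)]
    return w


def squared_norm(w):
    return sum(w_j * w_j for w_j in w)


# Hypothetical toy data: 2 examples, 3 parameters (underdetermined).
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 1.0]

w_gd = gradient_descent_least_squares(X, y)

# Minimum-norm interpolant X^T (X X^T)^{-1} y, worked out by hand for this X, y.
w_min_norm = [1.0 / 3.0, 1.0 / 3.0, 2.0 / 3.0]

# Another global optimum (zero training loss) with a larger norm.
w_other = [1.0, 1.0, 0.0]

print("gradient descent:", w_gd)
print("squared norms:", squared_norm(w_gd), squared_norm(w_other))
```

Both `w_min_norm` and `w_other` fit the training data exactly, so the loss alone cannot distinguish them; the preference for the smaller-norm solution comes from the algorithm and its initialization.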

artificial intelligence, asymptotic direction, machine learning

Country:
- North America > United States (0.28)

Genre:
- Research Report (0.82)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Statistical Learning (1.00)
  - Neural Networks > Deep Learning (0.34)
