

 overparameterization










Neural Information Processing Systems

Recently there has been significant theoretical progress on understanding the convergence and generalization of gradient-based methods on nonconvex losses with overparameterized models. Nevertheless, many aspects of optimization and generalization, and in particular the critical role of small random initialization, are not fully understood.