An Improved Analysis of Training Over-parameterized Deep Neural Networks
Neural Information Processing Systems
A recent line of research has shown that gradient-based algorithms with random initialization can converge to the global minima of the training loss for over-parameterized (i.e., sufficiently wide) deep neural networks. However, the condition on the width of the neural network to ensure global convergence is very stringent, often a high-degree polynomial in the training sample size n (e.g., O(n^24)).