Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks

Neural Information Processing Systems 

Even if one replaces the real labels of a training data set with purely random labels, an over-parameterized neural network can still fit the training data perfectly.
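This phenomenon is easy to reproduce in miniature. The sketch below (not the paper's setup; all sizes and hyperparameters are illustrative assumptions) trains a wide two-layer ReLU network by full-batch gradient descent on a handful of points whose labels are purely random, and drives the training error close to zero:

```python
import numpy as np

# Illustrative sketch, not the paper's experiment: a two-layer ReLU net
# whose hidden width m greatly exceeds the sample count n is trained by
# full-batch gradient descent to fit labels with no signal in them.
rng = np.random.default_rng(0)

n, d, m = 8, 5, 512                       # samples, input dim, hidden width
X = rng.standard_normal((n, d))
y = rng.choice([-1.0, 1.0], size=n)       # purely random +/-1 labels

W1 = rng.standard_normal((d, m)) / np.sqrt(d)
W2 = rng.standard_normal(m) / np.sqrt(m)

lr = 0.005
for _ in range(20000):
    H = np.maximum(X @ W1, 0.0)           # hidden ReLU features
    pred = H @ W2
    err = pred - y                        # residual of squared loss
    gW2 = H.T @ err / n                   # gradient w.r.t. output weights
    gH = np.outer(err, W2) * (H > 0)      # backprop through ReLU
    gW1 = X.T @ gH / n                    # gradient w.r.t. input weights
    W1 -= lr * gW1
    W2 -= lr * gW2

mse = np.mean((np.maximum(X @ W1, 0.0) @ W2 - y) ** 2)
```

Because the width is far larger than the number of samples, the network has enough capacity to interpolate even meaningless labels, so the final training MSE is driven to (near) zero.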
