Reviews: Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
–Neural Information Processing Systems
Originality: To the best of my knowledge, the results are novel and provide important extensions and improvements over prior art.

Quality: I did a high-level check of the proofs, and they seem sound to me.

Clarity: The paper is a joy to read. The problem definition, assumptions, algorithm, and statement of results are all very well presented.

Significance: The results provide several extensions and improvements over previous work, including training deeper models, training all layers, training with SGD (rather than GD), and a smaller required level of overparameterization.