Reviews: Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel

Neural Information Processing Systems 

Summary: The paper studies the generalization and optimization of regularized neural networks and makes two key contributions: (a) it shows an O(d) sample complexity gap between the global minima of the regularized loss and the induced kernel (neural tangent kernel, NTK) method; (b) it establishes that for infinite-width two-layer nets, a variant of gradient descent converges to a global minimum of the (weakly) regularized cross-entropy loss in polynomially many iterations. The paper studies a natural and important problem and makes fundamental contributions in this direction. Recent results in deep learning theory exploit the neural tangent kernel connection to prove optimization and generalization results; in light of this, it is important to study the limitations of this connection.
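
To make the optimization setting concrete, here is a minimal sketch (not the paper's code) of the kind of procedure contribution (b) analyzes: noisy gradient descent on a weakly l2-regularized cross-entropy loss for a wide two-layer ReLU net. The width, step size, noise scale, and regularization strength `lam` below are illustrative choices, and Gaussian step noise is a stand-in for whatever perturbation the paper's gradient descent variant actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

d, m, n = 10, 512, 200       # input dim, hidden width, sample count
lam = 1e-4                   # weak l2 regularization strength (illustrative)

# Toy binary classification data with labels in {-1, +1}.
X = rng.standard_normal((n, d))
y = np.sign(X[:, 0] + 0.1 * rng.standard_normal(n))

W = rng.standard_normal((m, d)) / np.sqrt(d)   # first-layer weights
a = rng.standard_normal(m) / np.sqrt(m)        # second-layer weights

def forward(X, W, a):
    H = np.maximum(X @ W.T, 0.0)               # ReLU features, shape (n, m)
    return H, H @ a                            # hidden activations, logits

def reg_loss(X, y, W, a):
    _, f = forward(X, W, a)
    # Logistic (cross-entropy) loss plus weak l2 regularization.
    ce = np.mean(np.logaddexp(0.0, -y * f))
    return ce + lam * (np.sum(W**2) + np.sum(a**2))

lr, noise = 0.1, 1e-3
for t in range(2000):
    H, f = forward(X, W, a)
    g = -y / (1.0 + np.exp(y * f)) / n          # d(loss)/d(logits)
    grad_a = H.T @ g + 2 * lam * a
    grad_W = ((g[:, None] * (H > 0)) * a).T @ X + 2 * lam * W
    # "Noisy" gradient descent: add isotropic Gaussian noise to each step.
    W -= lr * grad_W + noise * rng.standard_normal(W.shape)
    a -= lr * grad_a + noise * rng.standard_normal(a.shape)

print("final regularized loss:", reg_loss(X, y, W, a))
```

The point of the sketch is only to fix notation: the paper's result concerns the global minimum of this regularized objective as the width m goes to infinity, whereas the NTK baseline in contribution (a) corresponds to keeping the first-layer features effectively frozen at initialization.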