Review: Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel

Neural Information Processing Systems 

This paper investigates how regularization helps in training neural networks, in contrast to the unregularized neural tangent kernel (NTK) method. It shows that regularized networks can capture an "informative signal" that the NTK model cannot, which highlights the effectiveness of regularization. Moreover, the paper proves polynomial-time convergence of the gradient flow corresponding to the infinite-width neural network. The contribution is novel, and its implications are quite instructive for neural tangent kernel learning. In particular, the lower bound for kernel learning is a novel contribution.
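The flavor of the regularized setting can be illustrated with a toy experiment; the following is a minimal NumPy sketch, not the paper's construction: a two-layer ReLU network trained by gradient descent with an l2 (weight-decay) penalty on data whose label depends on a single "informative" coordinate. The network width, data distribution, step size, and regularization strength are all illustrative choices of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: labels depend only on the first ("informative") coordinate of x.
n, d, m = 200, 10, 64          # samples, input dimension, hidden width
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0])

# Two-layer ReLU net f(x) = a^T relu(W x) / sqrt(m), second layer fixed.
W = rng.normal(size=(m, d))
a = rng.choice([-1.0, 1.0], size=m)

def forward(W, X):
    return np.maximum(W @ X.T, 0).T @ a / np.sqrt(m)

def loss(W, X, y, lam):
    r = forward(W, X) - y
    return 0.5 * np.mean(r ** 2) + 0.5 * lam * np.sum(W ** 2)

# Gradient descent on the first-layer weights with l2 regularization.
lam, lr = 1e-3, 0.2
L0 = loss(W, X, y, lam)
for _ in range(1000):
    pre = X @ W.T                        # (n, m) preactivations
    act = np.maximum(pre, 0)
    r = act @ a / np.sqrt(m) - y         # residuals
    # grad wrt W_j: (1/n) * sum_i r_i * (a_j/sqrt(m)) * 1[pre_ij > 0] * x_i
    G = ((r[:, None] * (pre > 0)) * (a / np.sqrt(m))).T @ X / n
    W -= lr * (G + lam * W)
L1 = loss(W, X, y, lam)
print(L1 < L0)  # the regularized objective decreases during training
```

The paper's point is that in such settings the regularized network can adapt its first-layer weights toward the informative direction, whereas the induced kernel (NTK) predictor is tied to the random features at initialization; this sketch only exercises the regularized-training side of that comparison.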