Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei, Jason D. Lee, Qiang Liu, Tengyu Ma
Neural Information Processing Systems
Recent works have shown that on sufficiently over-parametrized neural nets, gradient descent with relatively large initialization optimizes a prediction function in the RKHS of the Neural Tangent Kernel (NTK).
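To make the kernel in this statement concrete, the following is a minimal sketch of the empirical NTK at initialization, K(x, x') = <grad_theta f(x), grad_theta f(x')>. It assumes a two-layer ReLU network f(x) = (1/sqrt(m)) * sum_j a_j * relu(w_j . x); the architecture, the scaling, and the names init_params, param_gradient, and empirical_ntk are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def init_params(d, m, rng):
    """Random initialization (an assumed scheme): Gaussian hidden weights, +/-1 output weights."""
    W = rng.standard_normal((m, d))        # hidden-layer weights, shape (m, d)
    a = rng.choice([-1.0, 1.0], size=m)    # output-layer weights, shape (m,)
    return W, a


def param_gradient(W, a, x):
    """Gradient of f(x) = (1/sqrt(m)) * sum_j a_j * relu(w_j . x) w.r.t. all parameters."""
    m = W.shape[0]
    pre = W @ x                                           # pre-activations, shape (m,)
    grad_a = np.maximum(pre, 0.0) / np.sqrt(m)            # df/da_j
    active = (pre > 0).astype(float)                      # ReLU derivative (taken as 0 at 0)
    grad_W = (a * active)[:, None] * x[None, :] / np.sqrt(m)  # df/dw_j, shape (m, d)
    return np.concatenate([grad_a, grad_W.ravel()])


def empirical_ntk(W, a, X):
    """Gram matrix of parameter gradients over the rows of X."""
    J = np.stack([param_gradient(W, a, x) for x in X])    # shape (n, m + m*d)
    return J @ J.T                                        # shape (n, n)
```

Calling `empirical_ntk(W, a, X)` on a small batch `X` returns a symmetric positive semidefinite Gram matrix. Kernel regression with this matrix is the kind of NTK predictor that the abstract contrasts with the trained network.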