Generalization Bound of Gradient Flow through Training Trajectory and Data-dependent Kernel
Neural Information Processing Systems
Gradient-based optimization methods have shown remarkable empirical success, yet their theoretical generalization properties remain only partially understood. In this paper, we establish a generalization bound for gradient flow that aligns with the classical Rademacher complexity bounds for kernel methods, specifically those based on the RKHS norm and the kernel trace, through a data-dependent kernel called the loss path kernel (LPK).
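For orientation, the classical kernel bound that the abstract refers to can be sketched as follows. This is the standard textbook result for a norm-bounded RKHS ball, not the paper's LPK bound. The symbols are assumptions introduced here for illustration: a sample $x_1, \dots, x_n$, a kernel $k$ with Gram matrix $K$, a norm radius $B$, and Rademacher signs $\sigma_i$.

```latex
% Empirical Rademacher complexity of the RKHS ball
% F_B = { f in H_k : ||f||_{H_k} <= B } on the sample S = {x_1, ..., x_n}.
\[
  \hat{\mathfrak{R}}_S(\mathcal{F}_B)
  = \mathbb{E}_{\sigma}\!\left[ \sup_{f \in \mathcal{F}_B}
      \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(x_i) \right]
  \le \frac{B \sqrt{\operatorname{tr}(K)}}{n}
  = \frac{B}{n} \sqrt{\sum_{i=1}^{n} k(x_i, x_i)} .
\]
```

The bound depends on the function class only through the RKHS norm radius $B$ and on the data only through the kernel trace $\operatorname{tr}(K)$, which are the two quantities the abstract names.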
Jun-10-2026, 06:47:29 GMT