Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime

Mar-27-2025, 15:57:00 GMT–Neural Information Processing Systems

An implication of the bounds is that the network is biased to learn the top eigenfunctions of the Neural Tangent Kernel not just on the training set but over the entire input space. This bias depends only on the model architecture and the input distribution, not on the target function, which need not lie in the RKHS of the kernel. The result holds for deep architectures with fully connected, convolutional, and residual layers. Furthermore, the width does not need to grow polynomially with the number of samples to obtain high-probability bounds up to a stopping time. The proof exploits the low-effective-rank property of the Fisher Information Matrix at initialization, which implies a low effective dimension of the model (far smaller than the number of parameters). We conclude that local capacity control from the low effective rank of the Fisher Information Matrix is still underexplored theoretically.
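
As a rough illustration of the spectral bias described above, the following toy sketch runs gradient descent in function space for kernel regression and checks how fast the residual shrinks along each eigenvector of the Gram matrix. It is not the paper's construction: the Gaussian RBF kernel stands in for a Neural Tangent Kernel, the names `rbf_gram` and `residual_by_mode` are hypothetical, and the experiment only looks at the training points, whereas the paper's bounds concern the whole input space.

```python
import numpy as np

def rbf_gram(x, length_scale=0.5):
    # Assumption: a Gaussian RBF kernel as a stand-in, not an actual NTK.
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2.0 * length_scale ** 2))

def residual_by_mode(K, y, lr, steps):
    """Kernel gradient descent from f = 0; returns per-eigenvector residuals."""
    n = len(y)
    A = K / n                                  # normalized Gram matrix
    eigvals, eigvecs = np.linalg.eigh(A)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder: largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    f = np.zeros(n)
    for _ in range(steps):
        f = f - lr * A @ (f - y)               # function-space gradient step
    init = np.abs(eigvecs.T @ y)               # initial residual per mode (f = 0)
    res = np.abs(eigvecs.T @ (f - y))          # residual per mode after training
    return eigvals, res, init

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
y = rng.standard_normal(50)                    # arbitrary target, no RKHS assumption
steps = 200
eigvals, res, init = residual_by_mode(rbf_gram(x), y, lr=1.0, steps=steps)

# Each mode contracts by the factor (1 - lr * eigenvalue) per step.
predicted = init * np.abs(1.0 - eigvals) ** steps
```

In this linear setting the residual along the i-th eigenvector shrinks geometrically at a rate set by the i-th eigenvalue, so directions with large eigenvalues are fit almost immediately while those with near-zero eigenvalues barely move. Stopping early therefore yields a fit dominated by the top eigenvectors.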

Country:
- North America > United States (0.68)

Genre:
- Research Report (0.68)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
