A Universal Law of Robustness via Isoperimetry

Jan-19-2025, 13:09:11 GMT–Neural Information Processing Systems

Classically, data interpolation with a parametrized model class is possible as long as the number of parameters is larger than the number of equations to be satisfied. A puzzling phenomenon in the current practice of deep learning is that models are trained with many more parameters than what this classical theory would suggest. We propose a theoretical explanation for this phenomenon. We prove that for a broad class of data distributions and model classes, overparametrization is necessary if one wants to interpolate the data smoothly. Namely, we show that smooth interpolation requires d times more parameters than mere interpolation, where d is the ambient data dimension.
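
The scaling in the abstract can be made concrete with a small back-of-the-envelope sketch. It assumes the form of the bound given in the full paper, which this abstract does not spell out: a model with p parameters that fits n noisy points in dimension d has a Lipschitz constant of order at least sqrt(n*d/p), with constants and logarithmic factors omitted here. The function names and the sample numbers are hypothetical and only illustrate the arithmetic.

```python
import math


def lipschitz_lower_bound(n: int, d: int, p: int) -> float:
    """Order-of-magnitude lower bound sqrt(n*d/p) on the Lipschitz constant.

    Assumption: constants and log factors from the formal statement are
    dropped, so only the scaling in n, d and p is meaningful.
    """
    return math.sqrt(n * d / p)


def params_for_smooth_interpolation(n: int, d: int, target_lipschitz: float = 1.0) -> int:
    """Smallest p for which the bound above permits the target Lipschitz constant."""
    return math.ceil(n * d / target_lipschitz**2)


# Hypothetical dataset: n = 1000 points in ambient dimension d = 100.
n, d = 1000, 100

# Mere interpolation needs roughly p = n parameters; the bound then only
# permits a Lipschitz constant of order sqrt(d).
bound_at_p_equals_n = lipschitz_lower_bound(n, d, p=n)

# An O(1) Lipschitz constant requires p of order n*d, i.e. d times more.
p_smooth = params_for_smooth_interpolation(n, d)
```

With these numbers, moving from p = n to p = n*d is exactly the factor-of-d overparametrization that the abstract describes.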

isoperimetry, model class, universal law

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.44)