Generalization Through Growth: Hidden Dynamics Controls Depth Dependence

Sonoda, Sho, Hashimoto, Yuka, Ishikawa, Isao, Ikeda, Masahiro

May-22-2025–arXiv.org Machine Learning

Recent theory has reduced the depth dependence of generalization bounds from exponential to polynomial and even depth-independent rates, yet these results remain tied to specific architectures and Euclidean inputs. We present a unified framework for arbitrary \blue{pseudo-metric} spaces in which a depth-$k$ network is the composition of continuous hidden maps $f:\mathcal{X}\to \mathcal{X}$ and an output map $h:\mathcal{X}\to \mathbb{R}$. The resulting bound $O(\sqrt{(α+ \log β(k))/n})$ isolates the sole depth contribution in $β(k)$, the word-ball growth of the semigroup generated by the hidden layers. By Gromov's theorem polynomial (resp. exponential) growth corresponds to virtually nilpotent (resp. expanding) dynamics, revealing a geometric dichotomy behind existing $O(\sqrt{k})$ (sublinear depth) and $\tilde{O}(1)$ (depth-independent) rates. We further provide covering-number estimates showing that expanding dynamics yield an exponential parameter saving via compositional expressivity. Our results decouple specification from implementation, offering architecture-agnostic and dynamical-systems-aware guarantees applicable to modern deep-learning paradigms such as test-time inference and diffusion models.

artificial intelligence, machine learning, semigroup, (17 more...)

arXiv.org Machine Learning

May-22-2025

arXiv.org PDF

Add feedback

Country:
- Asia > Japan
  - Honshū > Kansai
    - Kyoto Prefecture > Kyoto (0.04)
    - Osaka Prefecture > Osaka (0.04)
- Europe
  - Sweden > Stockholm
    - Stockholm (0.04)
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
- North America > United States
  - Illinois > Cook County > Chicago (0.04)

Genre:
- Research Report > New Finding (0.66)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)