Generalization Through Growth: Hidden Dynamics Controls Depth Dependence
Sonoda, Sho, Hashimoto, Yuka, Ishikawa, Isao, Ikeda, Masahiro
Recent theory has reduced the depth dependence of generalization bounds from exponential to polynomial and even depth-independent rates, yet these results remain tied to specific architectures and Euclidean inputs. We present a unified framework for arbitrary \blue{pseudo-metric} spaces in which a depth-\(k\) network is the composition of continuous hidden maps \(f:\mathcal{X}\to \mathcal{X}\) and an output map \(h:\mathcal{X}\to \mathbb{R}\). The resulting bound $O(\sqrt{(α+ \log β(k))/n})$ isolates the sole depth contribution in \(β(k)\), the word-ball growth of the semigroup generated by the hidden layers. By Gromov's theorem polynomial (resp. exponential) growth corresponds to virtually nilpotent (resp. expanding) dynamics, revealing a geometric dichotomy behind existing $O(\sqrt{k})$ (sublinear depth) and $\tilde{O}(1)$ (depth-independent) rates. We further provide covering-number estimates showing that expanding dynamics yield an exponential parameter saving via compositional expressivity. Our results decouple specification from implementation, offering architecture-agnostic and dynamical-systems-aware guarantees applicable to modern deep-learning paradigms such as test-time inference and diffusion models.
May-22-2025
- Country:
- Asia > Japan
- Honshū > Kansai
- Kyoto Prefecture > Kyoto (0.04)
- Osaka Prefecture > Osaka (0.04)
- Honshū > Kansai
- Europe
- Sweden > Stockholm
- Stockholm (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Sweden > Stockholm
- North America > United States
- Illinois > Cook County > Chicago (0.04)
- Asia > Japan
- Genre:
- Research Report > New Finding (0.66)
- Technology: