Convergence beyond the over-parameterized regime using Rayleigh quotients
Neural Information Processing Systems
In this paper, we present a new strategy to prove the convergence of deep learning architectures to zero training (or even testing) loss under gradient flow. Our analysis centers on Rayleigh quotients, which we use to prove Kurdyka-Łojasiewicz inequalities for a broader set of neural network architectures and loss functions.
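As a rough illustration of how a Rayleigh quotient can yield a Łojasiewicz-type inequality, the following sketch gives the standard textbook argument; it is not necessarily the paper's exact formulation. All symbols are assumptions introduced here: $F(\theta) \in \mathbb{R}^n$ stacks the network outputs on the training set, $J(\theta)$ is its Jacobian, $\ell$ is the loss on outputs with minimum value zero, and $\lambda, c > 0$ are hypothetical constants.

```latex
% Assumed setup: L(\theta) = \ell(F(\theta)), gradient flow \dot\theta = -\nabla L(\theta).
\[
  \nabla L(\theta) = J(\theta)^{\top} g, \qquad g := \nabla \ell\bigl(F(\theta)\bigr),
\]
\[
  \|\nabla L(\theta)\|^{2} = g^{\top} J J^{\top} g = R(\theta)\,\|g\|^{2},
  \qquad
  R(\theta) := \frac{g^{\top} J(\theta) J(\theta)^{\top} g}{g^{\top} g}.
\]
% Assumption 1: the Rayleigh quotient stays bounded below along the trajectory, R(\theta(t)) \ge \lambda > 0.
% Assumption 2: the loss satisfies \|\nabla \ell(u)\|^{2} \ge c\,\ell(u)
%               (for the square loss \ell(u) = \tfrac12 \|u - y\|^{2}, this holds with c = 2).
\[
  \|\nabla L(\theta)\|^{2} \;\ge\; \lambda c\, L(\theta)
  \quad\Longrightarrow\quad
  \frac{d}{dt} L(\theta(t)) = -\|\nabla L(\theta(t))\|^{2} \;\le\; -\lambda c\, L(\theta(t)),
\]
\[
  L(\theta(t)) \;\le\; L(\theta(0))\, e^{-\lambda c\, t}.
\]
```

Under these assumptions, the lower bound only needs to hold for the Rayleigh quotient in the direction $g$, which is a weaker requirement than a lower bound on the smallest eigenvalue of $J J^{\top}$.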
Aug-14-2025, 13:17:06 GMT