Convergence beyond the over-parameterized regime using Rayleigh quotients

Neural Information Processing Systems 

In this paper, we present a new strategy to prove that deep learning architectures trained by gradient flow converge to zero training (or even testing) loss. Our analysis centers on Rayleigh quotients, which we use to prove Kurdyka-Łojasiewicz inequalities for a broader set of neural network architectures and loss functions.
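
To make the link between Rayleigh quotients and Kurdyka-Łojasiewicz inequalities concrete, the following is a minimal sketch for the squared loss. It is an illustration under stated assumptions, not necessarily the formulation used in the paper: the residual vector $r(\theta)$, its Jacobian $J(\theta)$, the quotient $\rho(\theta)$, and the constant $\mu$ are notation introduced here, and $r(\theta) \neq 0$ is assumed wherever $\rho$ appears.

```latex
\begin{align*}
% Gradient flow on a squared loss (assumed setting)
\dot\theta(t) &= -\nabla L(\theta(t)), \qquad
L(\theta) = \tfrac12 \|r(\theta)\|^2, \qquad
\nabla L(\theta) = J(\theta)^\top r(\theta), \\
% The squared gradient norm is a Rayleigh quotient of J J^T times the loss
\|\nabla L(\theta)\|^2 &= r^\top J J^\top r
  = \rho(\theta)\,\|r\|^2 = 2\,\rho(\theta)\,L(\theta),
\qquad \rho(\theta) = \frac{r^\top J J^\top r}{r^\top r}, \\
% Loss decay along the flow
\frac{d}{dt} L(\theta(t)) &= -\|\nabla L(\theta(t))\|^2
  = -2\,\rho(\theta(t))\,L(\theta(t)), \\
% If rho >= mu > 0 along the trajectory (a Lojasiewicz-type bound), Gronwall gives
L(\theta(t)) &\le L(\theta(0))\, e^{-2\mu t}.
\end{align*}
```

In this sketch, a lower bound on $\rho$ is needed only along the residual directions the flow actually visits. The bound $\rho(\theta) \ge \lambda_{\min}(J J^\top)$ is the stronger, uniform requirement, and it can only be positive when $J J^\top$ is nonsingular, which requires at least as many parameters as residual entries.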
