Stable Nonconvex-Nonconcave Training via Linear Interpolation

Feb-11-2025, 06:20:15 GMT–Neural Information Processing Systems

This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training. We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
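
To illustrate why interpolation can stabilize an iteration, here is a minimal sketch. It assumes the interpolation takes the form of an averaged (Krasnosel'skii-Mann style) update, x_next = (1 - lam) * x + lam * T(x), applied to a nonexpansive operator T. The operator (a 90-degree rotation in the plane), the function names, and the step size `lam` are illustrative choices, not taken from the paper.

```python
import math

def rotate90(x):
    """Toy nonexpansive operator T: rotate a 2-D point by 90 degrees.
    Its only fixed point is the origin. (Illustrative choice, not from the paper.)"""
    return (-x[1], x[0])

def interpolate(x, tx, lam):
    """Linear interpolation between the current point x and the operator output T(x)."""
    return tuple((1 - lam) * a + lam * b for a, b in zip(x, tx))

def run(x0, lam, steps):
    """Iterate x <- (1 - lam) * x + lam * T(x) for a fixed number of steps."""
    x = x0
    for _ in range(steps):
        x = interpolate(x, rotate90(x), lam)
    return x

def norm(x):
    """Euclidean distance from the fixed point at the origin."""
    return math.hypot(*x)

if __name__ == "__main__":
    start = (1.0, 0.0)
    # lam = 1.0 is the plain iteration x <- T(x): it cycles and never approaches the origin.
    print("plain iteration:", norm(run(start, 1.0, 50)))
    # lam = 0.5 averages each step with the previous point: the distance shrinks every step.
    print("interpolated:   ", norm(run(start, 0.5, 50)))
```

In this toy case the plain iteration keeps a constant distance from the fixed point, while the interpolated iteration contracts that distance by a factor of about 0.707 per step. Real training dynamics are far less well behaved, so this only shows the mechanism in its simplest setting.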

artificial intelligence, convergence, machine learning


Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)