Modified Gauss-Newton Algorithms under Noise
Krishna Pillutla, Vincent Roulet, Sham Kakade, Zaid Harchaoui
The Gauss-Newton method and its variants, such as the Levenberg-Marquardt method [15, 16], have been applied successfully in phase retrieval [5, 11, 20], nonlinear control [22, 24], and non-negative matrix factorization [12]. Modern machine learning problems such as deep learning possess a similar compositional structure, which makes Gauss-Newton-like algorithms potentially good candidates [8, 26, 30]. However, in such problems we are often interested in the generalization performance on unseen data, and it is unclear whether the additional cost of solving the Gauss-Newton subproblems can be amortized by the superior efficiency of Gauss-Newton-like algorithms. In this paper, we investigate whether modified Gauss-Newton methods or prox-linear algorithms with incremental gradient inner loops outperform direct stochastic subgradient algorithms, in terms of generalization error, on nonsmooth problems with a compositional objective and a finite-sum structure.
arXiv.org Artificial Intelligence
May-17-2023
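
To make the setting in the abstract concrete, here is a minimal sketch (not the paper's algorithm or code) of the two families being compared on a toy nonsmooth compositional finite-sum objective f(x) = (1/n) Σ_i |c_i(x)| with smooth inner maps c_i. The quadratic inner map, the step sizes, and the use of plain subgradient descent as the inner solver (standing in for the incremental gradient inner loop studied in the paper) are all illustrative assumptions.

```python
import numpy as np

# Toy nonsmooth compositional finite-sum objective (hypothetical example):
#   f(x) = (1/n) * sum_i |c_i(x)|,  with  c_i(x) = <a_i, x>^2 - b_i,
# i.e. a nonsmooth outer map h = |.| composed with smooth inner maps c_i.
rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.normal(size=(n, d))
b = (A @ rng.normal(size=d)) ** 2    # realizable targets
x0 = rng.normal(size=d)

def c(x):        # inner smooth map, one residual per example
    return (A @ x) ** 2 - b

def jac_c(x):    # Jacobian of c: row i is grad c_i(x) = 2 <a_i, x> a_i
    return 2.0 * (A @ x)[:, None] * A

def f(x):        # full objective h(c(x))
    return np.mean(np.abs(c(x)))

def sgd_step(x, lr=1e-3):
    # Direct stochastic subgradient step: chain rule on a single example.
    i = rng.integers(n)
    ci = A[i] @ x
    return x - lr * np.sign(ci ** 2 - b[i]) * 2.0 * ci * A[i]

def prox_linear_step(x, t=1.0, inner_iters=100, inner_lr=1e-2):
    # One modified Gauss-Newton / prox-linear outer step: linearize c at x
    # and (inexactly) solve the convex model
    #   min_y (1/n) sum_i |c_i(x) + <grad c_i(x), y - x>| + ||y - x||^2 / (2t).
    # Plain subgradient descent on the model stands in here for the
    # incremental gradient inner loop studied in the paper.
    cx, J = c(x), jac_c(x)
    y = x.copy()
    for _ in range(inner_iters):
        r = cx + J @ (y - x)                      # linearized residuals
        g = J.T @ np.sign(r) / n + (y - x) / t    # model subgradient
        y -= inner_lr * g
    return y

x_pl, x_sg = x0.copy(), x0.copy()
for _ in range(200):
    x_pl = prox_linear_step(x_pl)
for _ in range(200 * 100):                        # roughly matched oracle budget
    x_sg = sgd_step(x_sg)
print(f"prox-linear f = {f(x_pl):.4f},  subgradient f = {f(x_sg):.4f}")
```

The trade-off posed in the abstract is visible in the structure of this sketch: each prox-linear step pays for an inner loop over the linearized finite sum, while the stochastic subgradient method spends a comparable oracle budget on many cheap direct steps.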