Near-Optimal Distributed Minimax Optimization under the Second-Order Similarity

May-26-2025, 20:37:39 GMT–Neural Information Processing Systems

This paper considers distributed convex-concave minimax optimization under second-order similarity. We propose the stochastic variance-reduced optimistic gradient sliding (SVOGS) method, which takes advantage of the finite-sum structure of the objective through mini-batch client sampling and variance reduction. We prove that SVOGS achieves an $\varepsilon$-duality gap within ${\mathcal O}(\delta D^2/\varepsilon)$ communication rounds, with communication complexity ${\mathcal O}(n + \sqrt{n}\delta D^2/\varepsilon)$ and $\tilde{\mathcal O}(n + (\sqrt{n}\delta + L)D^2/\varepsilon \log(1/\varepsilon))$ local gradient calls, where $n$ is the number of nodes, $\delta$ is the degree of second-order similarity, $L$ is the smoothness parameter, and $D$ is the diameter of the constraint set. All of these complexities (nearly) match the corresponding lower bounds. For the specific $\mu$-strongly-convex-$\mu$-strongly-concave case, our algorithm has upper bounds on communication rounds, communication complexity, and local gradient calls of ${\mathcal O}(\delta/\mu \log(1/\varepsilon))$, ${\mathcal O}((n + \sqrt{n}\delta/\mu)\log(1/\varepsilon))$, and $\tilde{\mathcal O}((n + (\sqrt{n}\delta + L)/\mu)\log(1/\varepsilon))$ respectively, which are also nearly tight. Furthermore, we conduct numerical experiments to show the empirical advantages of the proposed method.
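For readers unfamiliar with the terminology, the block below sketches the setting the abstract refers to. It is not quoted from the paper: the constraint sets $\mathcal{X}$ and $\mathcal{Y}$, the local functions $f_i$, the stacked variable $z = (x, y)$, and the output pair $(\hat{x}, \hat{y})$ are notation assumed here, and the similarity condition is given in one commonly used form that may differ from the paper's exact assumption.

```latex
% Assumed standard setup; the abstract itself does not spell these out.
% Distributed finite-sum minimax problem over n nodes:
\min_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} \; f(x, y) := \frac{1}{n} \sum_{i=1}^{n} f_i(x, y)

% \delta-second-order similarity (one common form), for all z = (x, y):
\left\| \nabla^2 f_i(z) - \nabla^2 f(z) \right\| \le \delta, \quad i = 1, \dots, n

% \varepsilon-duality gap of an output (\hat{x}, \hat{y}):
\max_{y \in \mathcal{Y}} f(\hat{x}, y) - \min_{x \in \mathcal{X}} f(x, \hat{y}) \le \varepsilon
```

Under this reading, $\delta$ measures how far each node's local curvature deviates from the global one, so a small $\delta$ (similar local data) translates directly into fewer communication rounds in the bounds above.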






Technology:
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.64)