nonconvex-concave minimax problem
SAPD+: An Accelerated Stochastic Method for Nonconvex-Concave Minimax Problems
We propose a new stochastic method SAPD+ for solving nonconvex-concave minimax problems of the form $\min\max\mathcal{L}(x,y)=f(x)+\Phi(x,y)-g(y)$, where $f,g$ are closed convex and $\Phi(x,y)$ is a smooth function that is weakly convex in $x$, (strongly) concave in $y$. For both strongly concave and merely concave settings, SAPD+ achieves the best known oracle complexities of $\mathcal{O}(L\kappa_y\epsilon^{-4})$ and $\mathcal{O}(L^3\epsilon^{-6})$, respectively, without assuming compactness of the problem domain, where $\kappa_y$ is the condition number, and $L$ is the Lipschitz constant. We also propose SAPD+ with variance reduction, which enjoys the best known oracle complexity of $\mathcal{O}(L\kappa_y^2\epsilon^{-3})$ for weakly convex-strongly concave setting. We demonstrate the efficiency of SAPD+ on a distributionally robust learning problem with a nonconvex regularizer and also on a multi-class classification problem in deep learning.
Review for NeurIPS paper: Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function
The paper introduces a single-loop stochastic algorithm for solving a special class of nonconvex-concave minimax problems that achieves best-known complexity bound. The rebuttal addressed most of the reviewers' concerns on the algorithmic justification, although some concern remains in terms of the special structure. However, please consider revising the paper to address R1 and R3 's remarks, in particular: - Adjust the title to reflect the special structure instead of overclaim the contribution; - Elaborate the desirable property of single-loop algorithm over existing methods; - Add detailed comparisons to prior work including prox-linear algorithms for compositional problems and recent algorithms for general nonconvex-concave minimax problems.
SAPD+: An Accelerated Stochastic Method for Nonconvex-Concave Minimax Problems
We propose a new stochastic method SAPD for solving nonconvex-concave minimax problems of the form \min\max\mathcal{L}(x,y) f(x) \Phi(x,y)-g(y), where f,g are closed convex and \Phi(x,y) is a smooth function that is weakly convex in x, (strongly) concave in y . For both strongly concave and merely concave settings, SAPD achieves the best known oracle complexities of \mathcal{O}(L\kappa_y\epsilon {-4}) and \mathcal{O}(L 3\epsilon {-6}), respectively, without assuming compactness of the problem domain, where \kappa_y is the condition number, and L is the Lipschitz constant. We also propose SAPD with variance reduction, which enjoys the best known oracle complexity of \mathcal{O}(L\kappa_y 2\epsilon {-3}) for weakly convex-strongly concave setting. We demonstrate the efficiency of SAPD on a distributionally robust learning problem with a nonconvex regularizer and also on a multi-class classification problem in deep learning.
Two Completely Parameter-Free Alternating Gradient Projection Algorithms for Nonconvex-(strongly) Concave Minimax Problems
Yang, Junnan, Zhang, Huiling, Xu, Zi
Due to their importance in various emerging applications, efficient algorithms for solving minimax problems have recently received increasing attention. However, many existing algorithms require prior knowledge of the problem parameters in order to achieve optimal iteration complexity. In this paper, we propose two completely parameter-free alternating gradient projection algorithms, i.e., the PF-AGP-NSC algorithm and the PF-AGP-NC algorithm, to solve the smooth nonconvex-strongly concave and nonconvex-concave minimax problems respectively using a backtracking strategy, which does not require prior knowledge of parameters such as the Lipschtiz constant $L$ or the strongly concave constant $\mu$. Moreover, we show that the total number of gradient calls of the PF-AGP-NSC algorithm and the PF-AGP-NC algorithm to obtain an $\varepsilon$-stationary point is upper bounded by $\mathcal{O}\left( L\kappa^3\varepsilon^{-2} \right)$ and $\mathcal{O}\left( L^4\varepsilon^{-4} \right)$ respectively, where $\kappa$ is the condition number. As far as we know, the PF-AGP-NSC algorithm and the PF-AGP-NC algorithm are the first completely parameter-free algorithms for solving nonconvex-strongly concave minimax problems and nonconvex-concave minimax problems respectively. Numerical results validate the efficiency of the proposed PF-AGP algorithm.
An accelerated first-order regularized momentum descent ascent algorithm for stochastic nonconvex-concave minimax problems
Stochastic nonconvex minimax problems have attracted wide attention in machine learning, signal processing and many other fields in recent years. In this paper, we propose an accelerated first-order regularized momentum descent ascent algorithm (FORMDA) for solving stochastic nonconvex-concave minimax problems. The iteration complexity of the algorithm is proved to be $\tilde{\mathcal{O}}(\varepsilon ^{-6.5})$ to obtain an $\varepsilon$-stationary point, which achieves the best-known complexity bound for single-loop algorithms to solve the stochastic nonconvex-concave minimax problems under the stationarity of the objective function.
Zeroth-Order Alternating Randomized Gradient Projection Algorithms for General Nonconvex-Concave Minimax Problems
Xu, Zi, Shen, Jingjing, Wang, Ziqi, Dai, Yuhong
In this paper, we study zeroth-order algorithms for nonconvex-concave minimax problems, which have attracted widely attention in machine learning, signal processing and many other fields in recent years. We propose a zeroth-order alternating randomized gradient projection (ZO-AGP) algorithm for smooth nonconvex-concave minimax problems, and its iteration complexity to obtain an $\varepsilon$-stationary point is bounded by $\mathcal{O}(\varepsilon^{-4})$, and the number of function value estimation is bounded by $\mathcal{O}(d_{x}\varepsilon^{-4}+d_{y}\varepsilon^{-6})$ per iteration. Moreover, we propose a zeroth-order block alternating randomized proximal gradient algorithm (ZO-BAPG) for solving block-wise nonsmooth nonconvex-concave minimax optimization problems, and the iteration complexity to obtain an $\varepsilon$-stationary point is bounded by $\mathcal{O}(\varepsilon^{-4})$ and the number of function value estimation per iteration is bounded by $\mathcal{O}(K d_{x}\varepsilon^{-4}+d_{y}\varepsilon^{-6})$. To the best of our knowledge, this is the first time that zeroth-order algorithms with iteration complexity gurantee are developed for solving both general smooth and block-wise nonsmooth nonconvex-concave minimax problems. Numerical results on data poisoning attack problem validate the efficiency of the proposed algorithms.
On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems
Lin, Tianyi, Jin, Chi, Jordan, Michael I.
We consider nonconvex-concave minimax problems, $\min_{x} \max_{y\in\mathcal{Y}} f(x, y)$, where $f$ is nonconvex in $x$ but concave in $y$. The standard algorithm for solving this problem is the celebrated gradient descent ascent (GDA) algorithm, which has been widely used in machine learning, control theory and economics. However, despite the solid theory for the convex-concave setting, GDA can converge to limit cycles or even diverge in a general setting. In this paper, we present a nonasymptotic analysis of GDA for solving nonconvex-concave minimax problems, showing that GDA can find a stationary point of the function $\Phi(\cdot) :=\max_{y\in\mathcal{Y} }f(\cdot, y)$ efficiently. To the best our knowledge, this is the first theoretical guarantee for GDA in this setting, shedding light on its practical performance in many real applications.