Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods
Nouiehed, Maher, Sanjabi, Maziar, Huang, Tianjian, Lee, Jason D., Razaviyayn, Meisam
Recent applications that arise in machine learning have spurred significant interest in solving min-max saddle point games. This problem has been extensively studied in the convex-concave regime, for which a global equilibrium solution can be computed efficiently. In this paper, we study the problem in the non-convex regime and show that an $\varepsilon$--first order stationary point of the game can be computed when one player's objective can be optimized to global optimality efficiently. In particular, we first consider the case where the objective of one of the players satisfies the Polyak-{\L}ojasiewicz (PL) condition. For such a game, we show that a simple multi-step gradient descent-ascent algorithm finds an $\varepsilon$--first order stationary point of the problem in $\widetilde{\mathcal{O}}(\varepsilon^{-2})$ iterations. Then we show that our framework can also be applied to the case where the objective of the ``max-player'' is concave. In this case, we propose a multi-step gradient descent-ascent algorithm that finds an $\varepsilon$--first order stationary point of the game in $\widetilde{\mathcal{O}}(\varepsilon^{-3.5})$ iterations.
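The multi-step gradient descent-ascent scheme the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the toy objective, step sizes, and iteration counts are all assumptions chosen so the sketch runs quickly.

```python
import numpy as np

def multi_step_gda(grad_x, grad_y, x0, y0, eta_x=0.05, eta_y=0.2,
                   inner_steps=10, outer_steps=200):
    """Multi-step GDA sketch: approximately solve the inner maximization
    over y with several ascent steps, then take one descent step on x."""
    x, y = x0, y0
    for _ in range(outer_steps):
        for _ in range(inner_steps):      # inner loop: ascent on y
            y = y + eta_y * grad_y(x, y)
        x = x - eta_x * grad_x(x, y)      # outer loop: one descent step on x
    return x, y

# Toy game f(x, y) = x^2 + 2*x*y - y^2, strongly concave (hence PL) in y.
gx = lambda x, y: 2 * x + 2 * y           # df/dx
gy = lambda x, y: 2 * x - 2 * y           # df/dy
x, y = multi_step_gda(gx, gy, x0=1.0, y0=-0.5)
print(x, y)  # both approach the stationary point (0, 0)
```

Here the inner loop drives y toward the best response y(x) = x, after which the outer descent step contracts x toward the game's stationary point.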
Implicit Bias of Gradient Descent on Linear Convolutional Networks
Suriya Gunasekar, Jason D. Lee, Daniel Soudry, Nati Srebro
Implicit biases introduced by optimization algorithms play a crucial role in learning deep neural networks [Neyshabur et al., 2015b,a, Hochreiter and Schmidhuber, 1997, Keskar et al., 2016, Chaudhari et al.]. […] Consequently, optimization objectives for learning such high capacity models have many global minima that fit the training data perfectly. Minimizing the training loss on these (linear) models is therefore equivalent to minimizing the training loss for linear classification.
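A minimal, hypothetical illustration of such an implicit bias, using min-norm selection in underdetermined least squares rather than the paper's max-margin result for convolutional networks: when many parameter vectors fit the data exactly, plain gradient descent started at zero selects one specific interpolator, the minimum-norm one.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 10))   # 3 examples, 10 parameters: overparameterized
y = rng.normal(size=3)         # infinitely many w satisfy X @ w = y exactly

w = np.zeros(10)
for _ in range(20000):                    # plain gradient descent on MSE
    w -= 0.01 * X.T @ (X @ w - y)

# Closed-form minimum-norm interpolator for comparison.
w_min_norm = X.T @ np.linalg.solve(X @ X.T, y)
print(np.allclose(w, w_min_norm, atol=1e-6))  # True
```

Starting at zero keeps the iterates in the row space of X, which is exactly where the minimum-norm solution lives; the optimizer, not the objective, picks this particular global minimum.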
Greedy Pruning with Group Lasso Provably Generalizes for Matrix Sensing
Rajaraman, Nived, Devvrit, Mokhtari, Aryan, Ramchandran, Kannan
Pruning schemes have been widely used in practice to reduce the complexity of trained models with a massive number of parameters. In fact, several practical studies have shown that if a pruned model is fine-tuned with some gradient-based updates, it generalizes well to new samples. Although the above pipeline, which we refer to as pruning + fine-tuning, has been extremely successful in lowering the complexity of trained models, very little is known about the theory behind this success. In this paper, we address this issue by investigating the pruning + fine-tuning framework on the overparameterized matrix sensing problem with the ground truth $U_\star \in \mathbb{R}^{d \times r}$ and the overparameterized model $U \in \mathbb{R}^{d \times k}$ with $k \gg r$. We study the approximate local minima of the mean squared error augmented with a smooth version of a group Lasso regularizer, $\sum_{i=1}^k \| U e_i \|_2$. In particular, we prove that pruning all the columns below a certain explicit $\ell_2$-norm threshold results in a solution $U_{\text{prune}}$ which has the minimum number of columns $r$, yet is close to the ground truth in training loss. Moreover, in the subsequent fine-tuning phase, gradient descent initialized at $U_{\text{prune}}$ converges at a linear rate to its limit. While our analysis provides insights into the role of regularization in pruning, we also show that running gradient descent in the absence of regularization results in models which are not suitable for greedy pruning, i.e., many columns could have their $\ell_2$ norm comparable to that of the maximum. To the best of our knowledge, our results provide the first rigorous insights into why greedy pruning + fine-tuning leads to smaller models which also generalize well.
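The pruning + fine-tuning pipeline can be sketched as below. This is not the paper's setting: the observations are fully observed (plain matrix factorization rather than sensing), and the "trained" model is a hand-built stand-in with a few strong columns plus near-zero spurious ones, mimicking the regularized solutions the paper analyzes; the threshold and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, k = 8, 2, 6
U_star = rng.normal(size=(d, r))
M = U_star @ U_star.T                     # fully observed target (simplification)

# Stand-in for a trained overparameterized model: r strong columns near the
# ground truth plus k - r near-zero spurious columns.
U = np.hstack([U_star + 0.05 * rng.normal(size=(d, r)),
               0.01 * rng.normal(size=(d, k - r))])

# Greedy pruning: drop every column whose l2 norm is below a threshold.
tau = 0.5
U_prune = U[:, np.linalg.norm(U, axis=0) > tau].copy()

# Fine-tuning: gradient descent on ||U U^T - M||_F^2.
for _ in range(3000):
    U_prune -= 0.005 * 4 * (U_prune @ U_prune.T - M) @ U_prune

print(U_prune.shape[1])                   # 2: only the r true columns survive
```

Because the pruned model starts close to the ground truth, the fine-tuning phase converges quickly, in line with the linear-rate behavior the abstract describes.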
Efficiently avoiding saddle points with zero order methods: No gradients required
Flokas, Lampros, Vlatakis-Gkaragkounis, Emmanouil-Vasileios, Piliouras, Georgios
We consider the case of derivative-free algorithms for non-convex optimization, also known as zero order algorithms, that use only function evaluations rather than gradients. For a wide variety of gradient approximators based on finite differences, we establish asymptotic convergence to second order stationary points using a carefully tailored application of the Stable Manifold Theorem. Regarding efficiency, we introduce a noisy zero-order method that converges to second order stationary points, i.e., avoids saddle points. Our algorithm uses only $\tilde{\mathcal{O}}(1 / \epsilon^2)$ approximate gradient calculations and thus matches the convergence rate guarantees of its exact-gradient counterparts up to constants. In contrast to previous work, our convergence rate analysis avoids imposing additional dimension-dependent slowdowns in the number of iterations required for non-convex zero order optimization.
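A finite-difference gradient approximator of the kind the abstract refers to, plugged into noisy descent, can be sketched as follows. The test function, noise scale, and step sizes are illustrative assumptions, not the authors' construction; the point is only that function evaluations alone, plus a little noise, suffice to leave a saddle.

```python
import numpy as np

def fd_gradient(f, x, h=1e-5):
    """Central finite-difference gradient estimate: 2 * dim function
    evaluations per call, no derivatives required."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# Nonconvex test function with a saddle at the origin and minima at
# (0, +sqrt(2)) and (0, -sqrt(2)).
f = lambda x: x[0] ** 2 - x[1] ** 2 + 0.25 * x[1] ** 4

# Noisy zero-order descent: a small random perturbation of the estimated
# gradient lets the iterate escape the saddle region.
rng = np.random.default_rng(0)
x = np.array([0.0, 1e-6])                 # start almost exactly at the saddle
for _ in range(500):
    x -= 0.05 * (fd_gradient(f, x) + 1e-3 * rng.normal(size=2))
print(x)  # approaches a second order stationary point near (0, +/-sqrt(2))
```

With central differences the per-coordinate error is $O(h^2)$, so for smooth objectives the estimate is accurate enough that the iterates track exact-gradient descent closely.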