Reviews: SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points

Neural Information Processing Systems

After response: I maintain my evaluation regarding the limited technical innovation and the suboptimality of this paper. The basic Spider and SpiderBoost algorithms both target first-order stationary points; they are almost identical, and both give the n^{1/2} rate. The simple way to modify either algorithm to escape saddle points is to add a Negative Curvature Search (NCS) subroutine (which can be done in a very modular way, and is already shown in the Spider paper). I would say it is almost trivial to also show that SpiderBoost+NCS finds a second-order stationary point at the n^{1/2} rate. Comparing this paper with SpiderBoost+NCS, there is no improvement from n^{2/3} to n^{1/2} (since SpiderBoost is already n^{1/2}), and no simplification of Spider (as SpiderBoost already provided that). The only difference is replacing NCS by perturbations, which again requires some work, but most of the techniques are already there.


Reviews: SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points

Neural Information Processing Systems

After extensive back-and-forth discussion among the reviewers, we ultimately felt that, despite the trivial solutions possible by mixing Spider with negative curvature search, the approach of the present paper has its usefulness (as noted in the reviews), and that the paper can be accepted. The authors are, however, strongly encouraged to look into the reviews carefully and implement the points mentioned in the rebuttal: it would be a pity if, after all the back-and-forth that this paper's review cycle has witnessed, the authors did not update the paper to clarify its contribution, its value, and its contrasts with less implementable methods.


Progress in Nonsmooth Optimization part4(Machine Learning)

#artificialintelligence

Abstract: We propose and analyze several stochastic gradient algorithms for finding stationary points or local minima in nonconvex, possibly nonsmooth-regularized, finite-sum and online optimization problems. First, we propose a simple proximal stochastic gradient algorithm based on variance reduction called ProxSVRG+. We provide a clean and tight analysis of ProxSVRG+, which shows that it outperforms deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, hence solving an open problem posed in Reddi et al. (2016b). Also, ProxSVRG+ uses far fewer proximal oracle calls than ProxSVRG (Reddi et al., 2016b) and extends to the online setting by avoiding full gradient computations. Then, we further propose an optimal algorithm, called SSRGD, based on SARAH (Nguyen et al., 2017), and show that SSRGD further improves the gradient complexity of ProxSVRG+ and achieves the optimal upper bound, matching the known lower bounds of Fang et al. (2018) and Li et al. (2021).
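The ProxSVRG+-style update described above combines an SVRG-type variance-reduced gradient estimate with a single proximal call per step. The following is a minimal sketch on a hypothetical l1-regularized least-squares instance (the problem data, step size, and epoch count are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical nonsmooth finite-sum instance:
# F(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2 + lam * ||x||_1
n, d, lam, eta = 64, 10, 0.01, 0.01
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(x, i):
    """Stochastic gradient of the i-th smooth component."""
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):
    """Full gradient of the smooth part."""
    return A.T @ (A @ x - b) / n

def prox_l1(x, t):
    """Proximal operator of t * lam * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t * lam, 0.0)

x = np.zeros(d)
for epoch in range(30):
    snapshot = x.copy()
    g_snap = full_grad(snapshot)       # full gradient at the snapshot point
    for _ in range(n):                 # inner loop, minibatch size 1
        i = rng.integers(n)
        # SVRG-style variance-reduced gradient estimate
        v = grad_i(x, i) - grad_i(snapshot, i) + g_snap
        x = prox_l1(x - eta * v, eta)  # one proximal oracle call per step
```

Note that only one proximal call is made per inner iteration; the variance-reduced estimate `v` is what allows minibatch sizes far smaller than those required by deterministic ProxGD.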


Simple and Optimal Stochastic Gradient Methods for Nonsmooth Nonconvex Optimization

Li, Zhize, Li, Jian

arXiv.org Artificial Intelligence

We propose and analyze several stochastic gradient algorithms for finding stationary points or local minima in nonconvex, possibly with nonsmooth regularizer, finite-sum and online optimization problems. First, we propose a simple proximal stochastic gradient algorithm based on variance reduction called ProxSVRG+. We provide a clean and tight analysis of ProxSVRG+, which shows that it outperforms the deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, hence solving an open problem proposed in Reddi et al. (2016b). Also, ProxSVRG+ uses far fewer proximal oracle calls than ProxSVRG (Reddi et al., 2016b) and extends to the online setting by avoiding full gradient computations. Then, we further propose an optimal algorithm, called SSRGD, based on SARAH (Nguyen et al., 2017), and show that SSRGD further improves the gradient complexity of ProxSVRG+ and achieves the optimal upper bound, matching the known lower bounds of Fang et al. (2018) and Li et al. (2021). Moreover, we show that both ProxSVRG+ and SSRGD enjoy automatic adaptation to local structure of the objective function, such as the Polyak-\L{}ojasiewicz (PL) condition for nonconvex functions in the finite-sum case; i.e., we prove that both of them can automatically switch to faster global linear convergence without any of the restarts performed in the prior work ProxSVRG (Reddi et al., 2016b). Finally, we focus on the more challenging problem of finding an $(\epsilon, \delta)$-local minimum instead of just finding an $\epsilon$-approximate (first-order) stationary point (which may be some bad unstable saddle point). We show that SSRGD can find an $(\epsilon, \delta)$-local minimum by simply adding some random perturbations. Our algorithm is almost as simple as its counterpart for finding stationary points, and achieves similar optimal rates.
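The key ingredient distinguishing SSRGD from SVRG-type methods is the SARAH-style recursive gradient estimator, which corrects the previous estimate rather than recomputing against a fixed snapshot. A minimal sketch on a hypothetical least-squares toy problem (chosen only to exercise the estimator; the step size, epoch length, and iteration count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical least-squares instance: f(x) = (1/2n) * ||A x - b||^2
n, d, eta, m = 32, 5, 0.02, 32       # m = inner-loop length (one pass)
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(x, i):
    """Stochastic gradient of the i-th component."""
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):
    """Exact full gradient."""
    return A.T @ (A @ x - b) / n

x_prev = rng.normal(size=d)
v = full_grad(x_prev)                # exact gradient at the epoch start
g0 = np.linalg.norm(v)               # initial gradient norm, for reference
x = x_prev - eta * v
for t in range(1, 400):
    if t % m == 0:
        v = full_grad(x)             # periodic full-gradient restart
    else:
        i = rng.integers(n)
        # recursive (SARAH-style) estimator: update the previous estimate
        # in place, instead of referencing a fixed snapshot as SVRG does
        v = grad_i(x, i) - grad_i(x_prev, i) + v
    x_prev, x = x, x - eta * v
```

The recursion `v = grad_i(x) - grad_i(x_prev) + v` makes the estimator biased but keeps its accumulated error small within each epoch, which is what drives the improved $\sqrt{n}$-type gradient complexity.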


SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points

Li, Zhize

Neural Information Processing Systems

We analyze stochastic gradient algorithms for optimizing nonconvex problems. In particular, our goal is to find local minima (second-order stationary points) instead of just finding first-order stationary points, which may be bad unstable saddle points. We show that a simple perturbed version of the stochastic recursive gradient descent algorithm (called SSRGD) can find an $(\epsilon,\delta)$-second-order stationary point with $\widetilde{O}(\sqrt{n}/\epsilon^2 + \sqrt{n}/\delta^4 + n/\delta^3)$ stochastic gradient complexity for nonconvex finite-sum problems. As a by-product, SSRGD finds an $\epsilon$-first-order stationary point with $O(n + \sqrt{n}/\epsilon^2)$ stochastic gradients. These results are almost optimal, since Fang et al. [2018] provided a lower bound of $\Omega(\sqrt{n}/\epsilon^2)$ for finding even just an $\epsilon$-first-order stationary point.
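The perturbation idea can be illustrated with plain gradient descent on a two-dimensional strict-saddle toy function (a hypothetical example, not the paper's setting; SSRGD applies the same idea on top of the recursive stochastic gradient estimator, and the full algorithm also checks the function decrease after a perturbation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical strict-saddle function:
# f(x, y) = x^2 + (y^2 - 1)^2 has a saddle at (0, 0) and minima at (0, +-1).
def grad(w):
    x, y = w
    return np.array([2.0 * x, 4.0 * y * (y * y - 1.0)])

eta, radius = 0.05, 1e-2
w = np.zeros(2)                      # start exactly at the saddle point
perturbed = False
for _ in range(300):
    g = grad(w)
    if np.linalg.norm(g) < 1e-6 and not perturbed:
        # gradient is tiny: add one small uniform perturbation so that the
        # negative-curvature direction (the y-axis here) gets a nonzero
        # component, after which plain descent escapes the saddle
        w = w + radius * rng.uniform(-1.0, 1.0, size=2)
        perturbed = True
    else:
        w = w - eta * g
```

Starting exactly at the saddle, the gradient is zero and unperturbed descent would stay put forever; after one small random perturbation, the iterate drifts along the negative-curvature direction and converges to one of the minima at $(0, \pm 1)$. This is the mechanism that replaces the Negative Curvature Search subroutine discussed in the reviews above.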