AITopics | proxgd

Adaptive Proximal Gradient Method for Convex Optimization

Neural Information Processing SystemsMar-22-2026, 04:39:22 GMT

In this paper, we explore two fundamental first-order algorithms in convex optimization, namely, gradient descent (GD) and proximal gradient method (ProxGD). Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions. We propose adaptive versions of GD and ProxGD that are based on observed gradient differences and, thus, have no added computational costs. Moreover, we prove convergence of our methods assuming only local Lipschitzness of the gradient. In addition, the proposed versions allow for even larger stepsizes than those initially suggested in [MM20].

artificial intelligence, machine learning, proceedings, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.43)

Add feedback

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

Zhize Li, Jian Li

Neural Information Processing SystemsFeb-15-2026, 01:52:50 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, minibatch size, pl condition, (12 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.52)

Add feedback

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

Zhize Li, Jian Li

Neural Information Processing SystemsNov-20-2025, 20:53:22 GMT

In particular, ProxSVRG+ generalizes the best results given by the SCSG algorithm, recently proposed by [Lei et al.,

artificial intelligence, machine learning, pl condition, (15 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.52)

Add feedback

Adaptive Proximal Gradient Method for Convex Optimization

Neural Information Processing SystemsMay-27-2025, 13:46:53 GMT

In this paper, we explore two fundamental first-order algorithms in convex optimization, namely, gradient descent (GD) and proximal gradient method (ProxGD). Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions. We propose adaptive versions of GD and ProxGD that are based on observed gradient differences and, thus, have no added computational costs. Moreover, we prove convergence of our methods assuming only local Lipschitzness of the gradient. In addition, the proposed versions allow for even larger stepsizes than those initially suggested in [MM20].

adaptive proximal gradient method, convex optimization, proxgd

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.52)

Add feedback

Iterative Approximate Cross-Validation

Luo, Yuetian, Ren, Zhimei, Barber, Rina Foygel

arXiv.org Machine LearningMay-27-2023

Cross-validation (CV) is one of the most popular tools for assessing and selecting predictive models. However, standard CV suffers from high computational cost when the number of folds is large. Recently, under the empirical risk minimization (ERM) framework, a line of works proposed efficient methods to approximate CV based on the solution of the ERM problem trained on the full dataset. However, in large-scale problems, it can be hard to obtain the exact solution of the ERM problem, either due to limited computational resources or due to early stopping as a way of preventing overfitting. In this paper, we propose a new paradigm to efficiently approximate CV when the ERM problem is solved via an iterative first-order algorithm, without running until convergence. Our new method extends existing guarantees for CV approximation to hold along the whole trajectory of the algorithm, including at convergence, thus generalizing existing CV approximation methods. Finally, we illustrate the accuracy and computational efficiency of our method through a range of empirical studies.

artificial intelligence, assumption, machine learning, (16 more...)

arXiv.org Machine Learning

2303.02732

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.61)

Add feedback

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

Li, Zhize, Li, Jian

Neural Information Processing SystemsDec-31-2018

We analyze stochastic gradient algorithms for optimizing nonconvex, nonsmooth finite-sum problems. In particular, the objective function is given by the summation of a differentiable (possibly nonconvex) component, together with a possibly non-differentiable but convex component. We propose a proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+. Our main contribution lies in the analysis of ProxSVRG+. It recovers several existing convergence results and improves/generalizes them (in terms of the number of stochastic gradient oracle calls and proximal oracle calls). In particular, ProxSVRG+ generalizes the best results given by the SCSG algorithm, recently proposed by [Lei et al., NIPS'17] for the smooth nonconvex case. ProxSVRG+ is also more straightforward than SCSG and yields simpler analysis. Moreover, ProxSVRG+ outperforms the deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, which partially solves an open problem proposed in [Reddi et al., NIPS'16]. Also, ProxSVRG+ uses much less proximal oracle calls than ProxSVRG [Reddi et al., NIPS'16]. Moreover, for nonconvex functions satisfied Polyak-\L{}ojasiewicz condition, we prove that ProxSVRG+ achieves a global linear convergence rate without restart unlike ProxSVRG. Thus, it can \emph{automatically} switch to the faster linear convergence in some regions as long as the objective function satisfies the PL condition locally in these regions. Finally, we conduct several experiments and the experimental results are consistent with the theoretical results.

artificial intelligence, machine learning, proxsvrg, (15 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

Li, Zhize, Li, Jian

Neural Information Processing SystemsDec-31-2018

We analyze stochastic gradient algorithms for optimizing nonconvex, nonsmooth finite-sum problems. In particular, the objective function is given by the summation of a differentiable (possibly nonconvex) component, together with a possibly non-differentiable but convex component. We propose a proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+. Our main contribution lies in the analysis of ProxSVRG+. It recovers several existing convergence results and improves/generalizes them (in terms of the number of stochastic gradient oracle calls and proximal oracle calls). In particular, ProxSVRG+ generalizes the best results given by the SCSG algorithm, recently proposed by [Lei et al., NIPS'17] for the smooth nonconvex case. ProxSVRG+ is also more straightforward than SCSG and yields simpler analysis. Moreover, ProxSVRG+ outperforms the deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, which partially solves an open problem proposed in [Reddi et al., NIPS'16]. Also, ProxSVRG+ uses much less proximal oracle calls than ProxSVRG [Reddi et al., NIPS'16]. Moreover, for nonconvex functions satisfied Polyak-\L{}ojasiewicz condition, we prove that ProxSVRG+ achieves a global linear convergence rate without restart unlike ProxSVRG. Thus, it can \emph{automatically} switch to the faster linear convergence in some regions as long as the objective function satisfies the PL condition locally in these regions. Finally, we conduct several experiments and the experimental results are consistent with the theoretical results.

artificial intelligence, machine learning, proxsvrg, (15 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

Li, Zhize, Li, Jian

arXiv.org Machine LearningFeb-13-2018

We analyze stochastic gradient algorithms for optimizing nonconvex, nonsmooth finite-sum problems. In particular, the objective function is given by the summation of a differentiable (possibly nonconvex) component, together with a possibly non-differentiable but convex component. We propose a proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+. The algorithm is a slight variant of the ProxSVRG algorithm [Reddi et al., 2016b]. Our main contribution lies in the analysis of ProxSVRG+. It recovers several existing convergence results (in terms of the number of stochastic gradient oracle calls and proximal operations), and improves/generalizes some others. In particular, ProxSVRG+ generalizes the best results given by the SCSG algorithm, recently proposed by [Lei et al., 2017] for the smooth nonconvex case. ProxSVRG+ is more straightforward than SCSG and yields simpler analysis. Moreover, ProxSVRG+ outperforms the deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, which partially solves an open problem proposed in [Reddi et al., 2016b]. Finally, for nonconvex functions satisfied Polyak-{\L}ojasiewicz condition, we show that ProxSVRG+ achieves global linear convergence rate without restart. ProxSVRG+ is always no worse than ProxGD and ProxSVRG/SAGA, and sometimes outperforms them (and generalizes the results of SCSG) in this case.

artificial intelligence, machine learning, proxsvrg, (18 more...)

arXiv.org Machine Learning

1802.04477

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Filters

Collaborating Authors

proxgd

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Adaptive Proximal Gradient Method for Convex Optimization

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

Adaptive Proximal Gradient Method for Convex Optimization

Iterative Approximate Cross-Validation

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization