SDCA


Dimension-Free Iteration Complexity of Finite Sum Optimization Problems

Neural Information Processing Systems

Many canonical machine learning problems boil down to a convex optimization problem with a finite sum structure. However, while much progress has been made in developing faster algorithms for this setting, the inherent limitations of these problems are not satisfactorily addressed by existing lower bounds. Indeed, current bounds focus on first-order optimization algorithms, and only apply in the often unrealistic regime where the number of iterations is less than $\mathcal{O}(d/n)$ (where $d$ is the dimension and $n$ is the number of samples). In this work, we extend the framework of Arjevani et al. \cite{arjevani2015lower,arjevani2016iteration} to provide new lower bounds which are dimension-free and go beyond the assumptions of current bounds, thereby covering standard finite sum optimization methods, e.g., SAG, SAGA, SVRG, and SDCA without duality, as well as stochastic coordinate-descent methods, such as SDCA and accelerated proximal SDCA.



Accelerated Mini-Batch Stochastic Dual Coordinate Ascent

Neural Information Processing Systems

Stochastic dual coordinate ascent (SDCA) is an effective technique for solving regularized loss minimization problems in machine learning. This paper considers an extension of SDCA under the mini-batch setting that is often used in practice. Our main contribution is to introduce an accelerated mini-batch version of SDCA and prove a fast convergence rate for this method. We discuss an implementation of our method over a parallel computing system, and compare the results to both the vanilla stochastic dual coordinate ascent and to the accelerated deterministic gradient descent method of Nesterov [2007].
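For smooth losses, the dual coordinate maximization that vanilla SDCA performs has a closed form. As a minimal sketch of the plain (non-accelerated, single-coordinate) method the abstract builds on, here is SDCA for ridge regression; the squared-loss objective and function name are illustrative choices, not the paper's accelerated mini-batch variant:

```python
import numpy as np

def sdca_ridge(X, y, lam=0.1, epochs=100, seed=0):
    """Vanilla SDCA sketch for ridge regression:
        min_w (1/n) * sum_i (x_i . w - y_i)^2 + (lam/2) * ||w||^2.
    Maintains dual variables alpha with w = X.T @ alpha / (lam * n)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)                 # one dual variable per example
    w = np.zeros(d)                     # primal iterate, kept in sync with alpha
    sq_norms = (X ** 2).sum(axis=1)     # precomputed ||x_i||^2
    for _ in range(epochs):
        for i in rng.permutation(n):
            # Closed-form maximizer of the dual objective in coordinate i
            # (derived from the conjugate of the squared loss).
            delta = (y[i] - X[i] @ w - alpha[i] / 2) / (0.5 + sq_norms[i] / (lam * n))
            alpha[i] += delta
            w += delta * X[i] / (lam * n)
    return w
```

Each pass touches coordinates in random order; the accelerated mini-batch variant described in the paper instead updates a sampled batch of dual coordinates per step and adds a Nesterov-style momentum sequence.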



Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. The authors propose an accelerated proximal block coordinate descent algorithm, describe its application to standard regularized loss minimization problems, and conclude with experiments on a smoothed SVM. On the question of clarity: I found the paper on the whole difficult to follow, with the authors showing a marked preference for writing equations in lieu of explanations. There are also numerous small grammatical errors. I'm not aware of other algorithms that are designed to work on block-coordinate problems (although single-coordinate algorithms are common enough), and have to question the advantage of this formulation, aside from being slightly more general. Given that the application considered in section 4 is single-coordinate (am I correct about this?), it might simplify the presentation to work from a single-coordinate formulation, and merely mention that block-coordinate updates are also possible.




Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

Submitted by Assigned_Reviewer_1

Q1: The authors propose a non-uniform sampling scheme for variance reduced SGD type algorithms based on local smoothness and the fact that the gradient of many individual losses is constant. The authors show that such a scheme is able to outperform uniform sampling for SVRG and SDCA. Overall the idea is an interesting one and seems to perform well in practice. However, I feel that the paper has some major clarity issues. In general, I find the paper quite difficult to read.


SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives

Aaron Defazio, Francis Bach, Simon Lacoste-Julien

Neural Information Processing Systems

In this work we introduce a new optimisation method called SAGA in the spirit of SAG, SDCA, MISO and SVRG, a set of recently proposed incremental gradient algorithms with fast linear convergence rates. SAGA improves on the theory behind SAG and SVRG, with better theoretical convergence rates, and has support for composite objectives where a proximal operator is used on the regulariser. Unlike SDCA, SAGA supports non-strongly convex problems directly, and is adaptive to any inherent strong convexity of the problem. We give experimental results showing the effectiveness of our method.
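The abstract's two distinguishing claims, an unbiased gradient estimate built from a table of stored per-example gradients and proximal support for composite objectives, can be seen in a short sketch. The example below applies the SAGA-style update to a lasso problem; the objective, step size, and function name are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def saga_lasso(X, y, lam=0.1, epochs=60, seed=0):
    """SAGA-style sketch for min_w (1/(2n))*||Xw - y||^2 + lam*||w||_1.
    For squared loss, each stored gradient r_i * x_i is summarized by the
    scalar residual r_i, so the gradient table costs O(n) memory here."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    L = (X ** 2).sum(axis=1).max()   # per-example smoothness constant
    gamma = 1.0 / (3 * L)            # step size in the spirit of the paper
    w = np.zeros(d)
    resid = X @ w - y                # stored residuals (gradient table)
    g_avg = X.T @ resid / n          # running average of stored gradients
    for _ in range(epochs):
        for i in rng.integers(0, n, size=n):
            r_new = X[i] @ w - y[i]
            # Unbiased estimate: new grad - stored grad + table average.
            v = (r_new - resid[i]) * X[i] + g_avg
            # Refresh the table entry and its running average.
            g_avg += (r_new - resid[i]) * X[i] / n
            resid[i] = r_new
            # Gradient step, then prox of the l1 regulariser (soft-threshold).
            w = w - gamma * v
            w = np.sign(w) * np.maximum(np.abs(w) - gamma * lam, 0.0)
    return w
```

The proximal step is where composite objectives enter: swapping the soft-threshold for another proximal operator handles other regularisers, and unlike SDCA no dual formulation or strong convexity is required.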