Random Reshuffling: Simple Analysis with Vast Improvements

Dec-24-2025, 14:51:11 GMT–Neural Information Processing Systems

Random Reshuffling (RR) is an algorithm for minimizing finite-sum functions that utilizes iterative gradient descent steps in conjunction with data reshuffling. Often contrasted with its sibling Stochastic Gradient Descent (SGD), RR is usually faster in practice and enjoys significant popularity in convex and non-convex optimization. The convergence rate of RR has attracted substantial attention recently and, for strongly convex and smooth functions, it was shown to converge faster than SGD if 1) the stepsize is small, 2) the gradients are bounded, and 3) the number of epochs is large. We remove these 3 assumptions, improve the dependence on the condition number from $\kappa^2$ to $\kappa$ (resp.\

name change, random reshuffling, simple analysis, (5 more...)

Neural Information Processing Systems

Dec-24-2025, 14:51:11 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.81)