Export Reviews, Discussions, Author Feedback and Meta-Reviews
Neural Information Processing Systems
Summary of paper and review: In this paper, the authors consider stochastic-gradient algorithms and show how, using an importance/weighted sampling scheme, it is possible to attain faster convergence rates in certain regimes. In particular, for strongly convex problems, the authors show that if one knows the Lipschitz constant of every term in a finite-sum objective, one can attain convergence rates that depend not on a squared norm of the Lipschitz constants but on a 1-norm-like quantity, which is always smaller. The downside of this approach is that one must know these Lipschitz constants, and it is difficult (perhaps impossible) to apply the results to objectives that are not of the form f(x) = \sum_{i=1}^{n} f_i(x). I am also not convinced that I should care to use these algorithms; the lack of empirical insight leaves me wondering whether this analysis matters.

Detailed comments: The idea here is simple enough, and it makes sense.
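For concreteness, the sampling scheme under review can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the quadratic terms f_i, the constants a and b, and the step-size schedule are all hypothetical choices made here so the example runs; the paper's point is only that sampling term i with probability proportional to its Lipschitz constant L_i, and reweighting the gradient by 1/p_i to keep it unbiased, replaces worst-case constants with an L_1-like quantity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite-sum objective f(x) = sum_i f_i(x) with
# f_i(x) = 0.5 * a_i * (x - b_i)^2, so grad f_i is a_i-Lipschitz.
n = 50
a = rng.uniform(0.1, 10.0, size=n)   # per-term Lipschitz constants L_i
b = rng.normal(size=n)

L = a
p = L / L.sum()                      # importance sampling: P(i) proportional to L_i

x = 0.0
for t in range(2000):
    i = rng.choice(n, p=p)
    g = a[i] * (x - b[i]) / p[i]     # reweighted gradient: E[g] = grad f(x)
    x -= g / (L.sum() * (t + 1))     # decaying step size (illustrative choice)

x_star = (a * b).sum() / a.sum()     # exact minimizer of f, for comparison
```

With uniform sampling the variance of the gradient estimate is governed by the largest L_i; sampling proportionally to L_i and reweighting keeps the estimate unbiased while equalizing the per-sample contribution, which is where the 1-norm dependence comes from.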
Feb-10-2025, 00:10:31 GMT