Reviews: Exploiting the Structure: Stochastic Gradient Methods Using Raw Clusters
–Neural Information Processing Systems
The initial motivation seems to be the work of Hoffman et al. on the use of clustering to speed up stochastic methods for ERM. Their method was not proved to converge to the optimum because it uses biased stochastic gradients. That approach also seemed to work only for small clusters. This paper goes a long way toward developing the basic idea into a satisfying theoretical framework, one that also gives rise to efficient implementations. The paper is truly a pleasure to read and a very fine example of academic exposition.
Jan-20-2025, 10:46:56 GMT