Reviews: Beating SGD Saturation with Tail-Averaging and Minibatching
–Neural Information Processing Systems
I'll keep my mark and vote for accepting this paper. Yet the techniques for bounding each term seem borrowed and adapted from previous papers analyzing SGD for least-squares problems (the related papers are adequately cited). Quality and clarity: Theoretically speaking, the paper is self-contained: it provides proofs of all theorems and a clear discussion of all the assumptions it makes. Furthermore, despite the number of parameters involved in the analysis, the main results (Theorem 1 and Corollary 1) are stated clearly and are clearly compared with related work. However, the experimental section appears to lack a real dataset on which r can be computed and on which the difference between tail and uniform averaging could be observed.
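To make the comparison between the two averaging schemes concrete, here is a minimal sketch of minibatch SGD for least squares with both uniform and tail averaging of the iterates. It is not the authors' code: the function names, the constant step size, and the `tail_start` parameter are assumptions chosen for illustration, and any data fed to it would be the reader's own.

```python
import numpy as np


def minibatch_sgd_iterates(X, y, step_size, batch_size, n_steps, seed=0):
    """Run constant-step minibatch SGD on the least-squares loss.

    Returns an array of shape (n_steps, d) holding every iterate.
    All parameter names here are hypothetical, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    iterates = np.empty((n_steps, d))
    for t in range(n_steps):
        # Sample a minibatch of indices with replacement.
        idx = rng.integers(0, n, size=batch_size)
        residual = X[idx] @ w - y[idx]
        # Minibatch gradient of (1/2) * mean squared residual.
        grad = X[idx].T @ residual / batch_size
        w = w - step_size * grad
        iterates[t] = w
    return iterates


def uniform_average(iterates):
    """Average all iterates, including the early ones."""
    return iterates.mean(axis=0)


def tail_average(iterates, tail_start):
    """Average only the iterates from index tail_start onward."""
    return iterates[tail_start:].mean(axis=0)
```

On a dataset of one's choosing, the two estimators can then be compared by evaluating the excess risk of `uniform_average(iterates)` against that of `tail_average(iterates, tail_start)` for, say, `tail_start = n_steps // 2`; the choice of one half is an arbitrary illustration, not a value taken from the paper.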
Jan-23-2025, 13:10:02 GMT