Reviews: SEGA: Variance Reduction via Gradient Sketching

Neural Information Processing Systems 

In this paper, the authors propose a randomized first-order optimization method (SEGA) that progressively builds a variance-reduced estimate of the gradient from random linear measurements of the gradient. The proposed method (or class of methods, depending on the sketch matrix and metric used) updates the current estimate of the gradient through a sketch-and-project operation that combines new gradient information with the past estimate of the gradient. However, the quality of the paper deteriorates after page 6. The paper has minor typos and grammatical mistakes that can be corrected easily. The experiments are well thought out and highlight certain algorithmic features of the method; however, several details are missing (e.g., what is the dimension n of the problems solved?). Comparison with more methods would strengthen the claims made, and experiments on real ML problems would highlight the merits (and limitations) of SEGA.
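
To make the sketch-and-project update summarized at the start of this review concrete, here is a minimal sketch of my reading of it. It assumes the simplest special case: coordinate sketches (one partial derivative observed per iteration) and the Euclidean metric. The names `sega_coordinate` and `grad_coord`, the toy quadratic, and the step size are all my own illustrative choices, not taken from the paper, and this should not be read as the authors' implementation.

```python
import random


def sega_coordinate(grad_coord, x0, step, iters, seed=0):
    """SEGA-style loop with coordinate sketches (illustrative assumption).

    grad_coord(x, i) is a hypothetical oracle returning the i-th partial
    derivative of the objective at x.
    """
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    h = [0.0] * n  # running estimate of the gradient
    for _ in range(iters):
        i = rng.randrange(n)             # pick one coordinate uniformly
        delta = grad_coord(x, i) - h[i]  # mismatch on the measured coordinate
        # Unbiased gradient estimate: g = h + n * delta * e_i
        g = list(h)
        g[i] += n * delta
        # Sketch-and-project: overwrite only the measured coordinate of h
        h[i] += delta
        # Gradient-type step using the unbiased estimate
        x = [xj - step * gj for xj, gj in zip(x, g)]
    return x, h


# Toy problem (assumed for illustration): f(x) = 0.5 * sum(a_j * x_j^2)
a = [1.0, 2.0, 3.0]
x_final, h_final = sega_coordinate(
    lambda x, i: a[i] * x[i], x0=[1.0, -1.0, 2.0], step=0.01, iters=20000
)
```

On this toy problem the iterates should approach the minimizer at the origin while the estimate `h` approaches the true gradient there (zero), which is the variance-reduction behaviour described above. Reporting this kind of detail (dimension n, step size, number of iterations) is what the experimental section is currently missing.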