Compressing Gradient Optimizers via Count-Sketches
