Barzilai-Borwein Step Size for Stochastic Gradient Descent

Conghui Tan, Shiqian Ma, Yu-Hong Dai, Yuqiu Qian

Neural Information Processing Systems

SGD that can reduce the variance and improve the complexity.