Accelerating Stochastic Composition Optimization

Mengdi Wang, Ji Liu, Ethan Fang

Neural Information Processing Systems 

The popular stochastic gradient methods are well suited for minimizing expected-value objective functions or the sum of a large number of loss functions. Stochastic gradient methods find wide applications in estimation, online learning, and training of deep neural networks.