Scalable Adaptive Stochastic Optimization Using Random Projections

Gabriel Krummenacher (gabriel.krummenacher@inf.ethz.ch), Brian McWilliams

Neural Information Processing Systems 

The most commonly studied and used version restricts the proximal term to a diagonal matrix.
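As a rough illustration of a diagonal proximal term, the sketch below assumes an AdaGrad-style update, in which the diagonal matrix is built from accumulated squared gradients. The text above does not name the method, so this is an assumption. The function name `diagonal_adagrad_step` and the parameters `eta` and `delta` are hypothetical and chosen for illustration only.

```python
import math


def diagonal_adagrad_step(x, grad, accum, eta=0.1, delta=1e-8):
    """One update using a diagonal proximal matrix (AdaGrad-style sketch).

    x, grad and accum are equal-length lists of floats.
    accum[i] holds the running sum of squared gradients for coordinate i;
    it is the only second-order information that is stored, so memory and
    per-step cost are linear in the dimension.
    """
    # Accumulate squared gradients coordinate by coordinate (the diagonal).
    new_accum = [a + g * g for a, g in zip(accum, grad)]
    # Scale each coordinate's step by the inverse square root of its entry.
    new_x = [
        xi - eta * g / (delta + math.sqrt(a))
        for xi, g, a in zip(x, grad, new_accum)
    ]
    return new_x, new_accum


# Usage on a badly scaled quadratic f(x) = 0.5 * (100 * x0**2 + x1**2).
x = [1.0, 1.0]
accum = [0.0, 0.0]
grad = [100.0 * x[0], 1.0 * x[1]]
x, accum = diagonal_adagrad_step(x, grad, accum, eta=0.5)
```

On the first step both coordinates move by about the same amount even though their gradients differ by a factor of 100, because each coordinate is rescaled independently. A diagonal matrix cannot capture correlations between coordinates, which is the limitation that a full-matrix proximal term would address at a higher computational cost.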