Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix Completion

Neural Information Processing Systems 

The matrix completion problem seeks to recover a d\times d ground truth matrix of low rank r\ll d from observations of its individual elements. Real-world matrix completion is often a huge-scale optimization problem, with d so large that even the simplest full-dimension vector operations with O(d) time complexity become prohibitively expensive. Stochastic gradient descent (SGD) is one of the few algorithms capable of solving matrix completion on a huge scale, and can also naturally handle streaming data over an evolving ground truth. Unfortunately, SGD experiences a dramatic slow-down when the underlying ground truth is ill-conditioned; it requires at least O(\kappa\log(1/\epsilon)) iterations to get \epsilon -close to ground truth matrix with condition number \kappa . In this paper, we propose a preconditioned version of SGD that preserves all the favorable practical qualities of SGD for huge-scale online optimization while also making it agnostic to \kappa .