Review for NeurIPS paper: Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing

Neural Information Processing Systems

Summary and Contributions: This paper studies the problem of robust PCA for a sub-Gaussian distribution. Specifically, one is given samples X_1, X_2, ..., X_n from a sub-Gaussian distribution, an eps-fraction of which have been arbitrarily corrupted (modified by an adversary); the goal is to approximately recover the top singular vector of the covariance matrix Sigma. Here, "approximately recover" means finding a unit vector u such that u^T Sigma u >= (1 - gamma) ||Sigma||_op, where ||Sigma||_op is the operator norm of Sigma. The main results of this paper are algorithms for this task that are efficient in both runtime and sample complexity. Specifically, they show 1) An algorithm that achieves error gamma = O(eps log(1/eps)) in polynomial time, specifically in \tilde{O}(n d^2 / eps) time, using n = Omega(d / (eps log(1/eps))^2) samples.
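The approximation criterion in the summary can be made concrete with a small sketch. The code below is not the paper's algorithm; it only evaluates the ratio u^T Sigma u / ||Sigma||_op for a candidate direction u, so that a ratio of at least 1 - gamma corresponds to the guarantee described above. The helper names (`approximation_ratio`, `operator_norm_psd`) and the 2x2 example matrix are hypothetical, and the operator norm is estimated by plain power iteration, which assumes Sigma is a nonzero symmetric PSD matrix.

```python
import math
import random


def mat_vec(a, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def quad_form(a, u):
    """Return the quadratic form u^T A u."""
    return sum(u_i * w_i for u_i, w_i in zip(u, mat_vec(a, u)))


def normalize(v):
    """Scale a nonzero vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def operator_norm_psd(a, iters=500, seed=0):
    """Estimate ||A||_op for a nonzero symmetric PSD matrix by power iteration.

    Assumption: the random start is not orthogonal to the top eigenvector.
    """
    rng = random.Random(seed)
    v = normalize([rng.gauss(0.0, 1.0) for _ in a])
    for _ in range(iters):
        v = normalize(mat_vec(a, v))
    return quad_form(a, v)


def approximation_ratio(u, sigma):
    """Return u^T Sigma u / ||Sigma||_op after normalizing u to a unit vector."""
    u = normalize(u)
    return quad_form(sigma, u) / operator_norm_psd(sigma)


# Hypothetical covariance with eigenvalues 4 and 1.
sigma = [[4.0, 0.0], [0.0, 1.0]]
ratio_top = approximation_ratio([1.0, 0.0], sigma)  # exact top eigenvector
ratio_bad = approximation_ratio([0.0, 1.0], sigma)  # orthogonal direction
```

In this toy case the exact top eigenvector attains a ratio of 1 (gamma = 0), while the orthogonal direction attains only 1/4, so it would fail the criterion for any gamma below 3/4.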


Review for NeurIPS paper: Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing

Two new methods for this problem are proposed, one of which uses width-independent Schatten packing SDPs. Reviewers agree that this is an interesting, non-trivial, and solid theoretical work and should be accepted for NeurIPS. The rebuttal addressed the reviewers' concerns adequately. The recommendation is to accept this paper for presentation at NeurIPS. We urge the authors to make the connection between Schatten packing and the main approach clearer in the final version of the paper.


Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing

We develop two methods for the following fundamental statistical task: given an \eps-corrupted set of n samples from a d-dimensional sub-Gaussian distribution, return an approximate top eigenvector of the covariance matrix. Our first robust PCA algorithm runs in polynomial time, returns a 1 - O(\eps \log \eps^{-1})-approximate top eigenvector, and is based on a simple iterative filtering approach. Our second, which attains a slightly worse approximation factor, runs in nearly-linear time and sample complexity under a mild spectral gap assumption. These are the first polynomial-time algorithms yielding non-trivial information about the covariance of a corrupted sub-Gaussian distribution without requiring additional algebraic structure of moments. As a key technical tool, we develop the first width-independent solvers for Schatten-p norm packing semidefinite programs, giving a (1 + \eps)-approximate solution in O(p \log(\tfrac{nd}{\eps}) \eps^{-1}) input-sparsity time iterations (where n, d are problem dimensions).
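For readers unfamiliar with the Schatten-p norm named in the abstract, the following minimal sketch may help. It is only an illustration of the norm itself, not of the paper's packing solver. It relies on the standard fact that for a symmetric PSD matrix A and a positive integer p, the Schatten-p norm equals (Tr(A^p))^{1/p}; the function name `schatten_norm_psd` and the example matrix are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def schatten_norm_psd(a, p):
    """Schatten-p norm of a symmetric PSD matrix for a positive integer p.

    Uses ||A||_p = (Tr(A^p))^(1/p), which holds because the singular values
    of a PSD matrix coincide with its eigenvalues.
    """
    if p < 1:
        raise ValueError("p must be a positive integer")
    power = a
    for _ in range(p - 1):
        power = mat_mul(power, a)
    trace = sum(power[i][i] for i in range(len(power)))
    return trace ** (1.0 / p)


# Hypothetical PSD matrix with eigenvalues 4 and 1.
a = [[4.0, 0.0], [0.0, 1.0]]
norm_1 = schatten_norm_psd(a, 1)    # trace norm: 4 + 1
norm_2 = schatten_norm_psd(a, 2)    # Frobenius norm: sqrt(16 + 1)
norm_50 = schatten_norm_psd(a, 50)  # close to the operator norm, 4
```

As p grows, the Schatten-p norm approaches the operator norm, which is one way to see why a solver parameterized by p is relevant to a top-eigenvector problem.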