Robust PCA with compressed data

Neural Information Processing Systems 

The robust principal component analysis (RPCA) problem seeks to separate lowrank trends from sparse outliers within a data matrix, that is, to approximate a n d matrix D as the sum of a low-rank matrix L and a sparse matrix S. We examine the robust principal component analysis (RPCA) problem under data compression, where the data Y is approximately given by (L+S) C, that is, a low-rank + sparse data matrix that has been compressed to size n m (with m substantially smaller than the original dimension d) via multiplication with a compression matrix C. We give a convex program for recovering the sparse component S along with the compressed low-rank component L C, along with upper bounds on the error of this reconstruction that scales naturally with the compression dimension m and coincides with existing results for the uncompressed setting m = d. Our results can also handle error introduced through additive noise or through missing data. The scaling of dimension, compression, and signal complexity in our theoretical results is verified empirically through simulations, and we also apply our method to a data set measuring chlorine concentration across a network of sensors to test its performance in practice.