
Collaborating Authors: Narayanamurthy, Praneeth


Fast Robust Subspace Tracking via PCA in Sparse Data-Dependent Noise

arXiv.org Machine Learning

This work studies the robust subspace tracking (ST) problem. Robust ST can be simply understood as a (slow) time-varying subspace extension of robust PCA. It assumes that the true data lies in a low-dimensional subspace that is either fixed or changes slowly with time. The goal is to track the changing subspaces over time in the presence of additive sparse outliers and to do this quickly (with a short delay). We introduce a "fast" mini-batch robust ST solution that is provably correct under mild assumptions. Here "fast" means two things: (i) the subspace changes can be detected and the subspaces can be tracked with near-optimal delay, and (ii) the time complexity of doing this is the same as that of simple (non-robust) PCA. Our main result assumes piecewise constant subspaces (needed for identifiability), but we also provide a corollary for the case when there is a little change at each time. A second contribution is a novel non-asymptotic guarantee for PCA in linearly data-dependent noise. An important setting where this result is useful is for linearly data-dependent noise that is sparse with enough support changes over time. The subspace update step of our proposed robust ST solution uses this result.
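As a toy numerical illustration of this noise model (not the paper's algorithm; all dimensions and noise levels are illustrative choices), the sketch below generates data in a low-dimensional subspace, corrupts it with sparse, linearly data-dependent noise whose support changes over time, and checks that plain SVD-based PCA still recovers the subspace:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, d = 100, 3, 2000           # ambient dim, subspace dim, number of samples

# True subspace and data lying in it
U, _ = np.linalg.qr(rng.standard_normal((n, r)))
A = rng.standard_normal((r, d))
X = U @ A                        # clean data, rank r

# Sparse, linearly data-dependent noise: each column w_t = M_t x_t,
# with M_t supported on a few rows and the support changing over time
W = np.zeros((n, d))
for t in range(d):
    supp = rng.choice(n, size=5, replace=False)
    W[supp, t] = 0.1 * rng.standard_normal((5, r)) @ A[:, t]

Y = X + W

# Simple PCA: top-r left singular vectors of the observed data
Uh = np.linalg.svd(Y, full_matrices=False)[0][:, :r]

# Subspace recovery error: largest principal-angle sine
err = np.linalg.norm((np.eye(n) - Uh @ Uh.T) @ U, 2)
print(err)
```

Because the noise support changes across time, the data-dependent corruption averages out and the recovery error stays small.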


Phaseless Low Rank Matrix Recovery and Subspace Tracking

arXiv.org Machine Learning

This work introduces the first simple and provably correct solution for recovering a low-rank matrix from phaseless (magnitude-only) linear projections of each of its columns. This problem finds important applications in phaseless dynamic imaging, e.g., Fourier ptychographic imaging of live biological specimens. We demonstrate the practical advantage of our proposed approach, AltMinLowRaP, over existing work via extensive simulation and some real-data experiments. We also provide a solution for a dynamic extension of the above problem, which allows the low-dimensional subspace from which each image/signal is generated to change with time in a piecewise constant fashion. In recent years, there has been a resurgence of interest in the classical "phase retrieval (PR)" problem [1], [2], in which a signal must be recovered from the magnitudes of linear projections of it; these measurements are commonly referred to as phaseless linear projections of the unknown signal. While practical PR methods have existed for a long time, e.g., see [1], [2], the focus of the recent work has been on obtaining correctness guarantees for these and newer algorithms. This line of work includes convex relaxation methods [3], [4] as well as non-convex methods [5], [6], [7], [8], [9]. It is easy to see that, without extra assumptions, PR requires $m \ge n$. The best known guarantees - see [7] and follow-up works - prove exact recovery with high probability (whp) with an order-optimal number of measurements/samples, $m \ge Cn$.
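The order-of-$n$ sample requirement can be illustrated with the standard spectral initialization used by non-convex PR methods such as Wirtinger flow (a generic sketch, not AltMinLowRaP itself; sizes are illustrative): for real Gaussian measurement vectors, the top eigenvector of a magnitude-weighted covariance aligns with the signal up to sign.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 50, 4000                      # signal dimension, number of measurements

x = rng.standard_normal(n)
x /= np.linalg.norm(x)               # unit-norm unknown signal
A = rng.standard_normal((m, n))      # real Gaussian measurement vectors
y = np.abs(A @ x)                    # phaseless (magnitude-only) measurements

# Spectral initialization: top eigenvector of (1/m) sum_i y_i^2 a_i a_i^T,
# whose expectation is ||x||^2 I + 2 x x^T, so it aligns with +/- x
Ymat = (A * (y ** 2)[:, None]).T @ A / m
eigvals, eigvecs = np.linalg.eigh(Ymat)
xhat = eigvecs[:, -1]                # eigh sorts eigenvalues in ascending order

# Recovery is only possible up to a global sign flip
err = min(np.linalg.norm(xhat - x), np.linalg.norm(xhat + x))
print(err)
```

With m much smaller than n, the weighted covariance no longer concentrates and the estimate degrades, which is one way to see why m must grow at least linearly in n.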


Subspace Tracking from Missing and Outlier Corrupted Data

arXiv.org Machine Learning

We study the related problems of subspace tracking in the presence of missing data (ST-miss) as well as robust subspace tracking with missing data (RST-miss). Here "robust" refers to robustness to sparse outliers. In recent work, we have studied the RST problem without missing data. In this work, we show that simple modifications of our solution approach for RST also provably solve ST-miss and RST-miss under weaker and similar assumptions respectively. To our knowledge, our result is the first complete guarantee for both ST-miss and RST-miss. This means we are able to show that, under assumptions on only the algorithm inputs (input data and/or initialization), the output subspace estimates are close to the true data subspaces at all times. Our guarantees hold under mild and easily interpretable assumptions and handle time-varying subspaces (unlike all previous work). We also show that our algorithm and its extensions are fast and have competitive experimental performance when compared with existing methods.
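A minimal sketch of the zero-filling idea behind subspace estimation from missing data (illustrative parameters, not the paper's full algorithm): if entries are observed independently with probability p, rescaling the zero-filled data by 1/p gives an entrywise unbiased estimate of the full matrix, so plain top-r SVD still recovers the subspace.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, d, p = 80, 2, 3000, 0.9        # ambient dim, rank, samples, observe prob.

U, _ = np.linalg.qr(rng.standard_normal((n, r)))
X = U @ rng.standard_normal((r, d))  # clean data lying in span(U)

mask = rng.random((n, d)) < p        # Bernoulli(p) observation pattern
Yobs = np.where(mask, X, 0.0)        # missing entries set to zero

# E[Yobs / p] = X entrywise, so top-r SVD of the rescaled zero-filled
# matrix gives a consistent subspace estimate
Uh = np.linalg.svd(Yobs / p, full_matrices=False)[0][:, :r]

# Largest principal-angle sine between estimate and truth
err = np.linalg.norm((np.eye(n) - Uh @ Uh.T) @ U, 2)
print(err)
```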


Nearly Optimal Robust Subspace Tracking

arXiv.org Machine Learning

In this work, we study the robust subspace tracking (RST) problem and obtain one of the first two provable guarantees for it. The goal of RST is to track sequentially arriving data vectors that lie in a slowly changing low-dimensional subspace, while being robust to corruption by additive sparse outliers. It can also be interpreted as a dynamic (time-varying) extension of robust PCA (RPCA), with the minor difference that RST also requires a short tracking delay. We develop a recursive projected compressive sensing algorithm that we call Nearly Optimal RST via ReProCS (ReProCS-NORST) because its tracking delay is nearly optimal. We prove that NORST solves both the RST and the dynamic RPCA problems under weakened standard RPCA assumptions, two simple extra assumptions (slow subspace change and most outlier magnitudes lower bounded), and a few minor assumptions. Our guarantee shows that NORST enjoys a near optimal tracking delay of $O(r \log n \log(1/\epsilon))$. Its required delay between subspace change times is the same, and its memory complexity is $n$ times this value. Thus both these are also nearly optimal. Here $n$ is the ambient space dimension, $r$ is the subspaces' dimension, and $\epsilon$ is the tracking accuracy. NORST also has the best outlier tolerance compared with all previous RPCA or RST methods, both theoretically and empirically (including for real videos), without requiring any model on how the outlier support is generated. This is possible because of the extra assumptions it uses.
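The core recursive projected compressive sensing step can be sketched as follows. This is a simplified toy that assumes the subspace estimate is exact and uses plain thresholding in place of actual compressive sensing; the threshold exploits the lower bound on outlier magnitudes, and all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 60, 2

# Assume the current subspace estimate equals the true subspace P
P, _ = np.linalg.qr(rng.standard_normal((n, r)))
ell = P @ rng.standard_normal(r)                 # true low-dimensional signal
s = np.zeros(n)
supp = rng.choice(n, 4, replace=False)
s[supp] = 5.0 + rng.random(4)                    # outliers with magnitude >= 5
y = ell + s

# Projected step: projecting y orthogonal to the subspace estimate
# nulls out ell, leaving a sparse recovery problem for the outlier s
Phi = np.eye(n) - P @ P.T
b = Phi @ y                                      # here b = Phi @ s exactly
supp_hat = np.where(np.abs(b) > 2.5)[0]          # half the magnitude lower bound
shat = np.zeros(n)
shat[supp_hat] = np.linalg.lstsq(Phi[:, supp_hat], b, rcond=None)[0]
ellhat = y - shat                                # outlier-free estimate
err = np.linalg.norm(ellhat - ell) / np.linalg.norm(ell)
print(err)
```

In the full algorithm the subspace estimate is only approximate and is itself updated from a mini-batch of the `ellhat` vectors, which is where the PCA-in-data-dependent-noise analysis enters.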


Static and Dynamic Robust PCA via Low-Rank + Sparse Matrix Decomposition: A Review

arXiv.org Machine Learning

Principal Components Analysis (PCA) is one of the most widely used dimension reduction techniques. Robust PCA (RPCA) refers to the problem of PCA when the data may be corrupted by outliers. Recent work by Candes, Wright, Li, and Ma defined RPCA as a problem of decomposing a given data matrix into the sum of a low-rank matrix (true data) and a sparse matrix (outliers). The column space of the low-rank matrix then gives the PCA solution. This simple definition has led to a large amount of interesting new work on provably correct, fast, and practically useful solutions to the RPCA problem. More recently, the dynamic (time-varying) version of the RPCA problem has been studied and a series of provably correct, fast, and memory efficient tracking solutions have been proposed. Dynamic RPCA (or robust subspace tracking) is the problem of tracking data lying in a (slowly) changing subspace while being robust to sparse outliers. This article provides an exhaustive review of the last decade of literature on RPCA and its dynamic counterpart (robust subspace tracking), along with describing their theoretical guarantees, discussing the pros and cons of various approaches, and providing empirical comparisons of performance and speed.
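The low-rank + sparse decomposition can be demonstrated with a minimal alternating-projections heuristic, written in the spirit of non-convex RPCA solvers but not any specific published algorithm; the threshold and problem sizes are toy choices.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, r = 60, 60, 2

L = rng.standard_normal((n, r)) @ rng.standard_normal((r, d))   # low rank (true data)
S = np.zeros((n, d))
idx = rng.random((n, d)) < 0.05                                 # ~5% outlier entries
S[idx] = 50.0 * np.sign(rng.standard_normal(idx.sum()))         # large sparse outliers
M = L + S                                                       # observed matrix

# Alternate between hard-thresholding the residual to get the sparse
# part and projecting the remainder back onto the set of rank-r matrices
Lh = np.zeros_like(M)
for _ in range(30):
    R = M - Lh
    Sh = np.where(np.abs(R) > 10.0, R, 0.0)     # sparse-part estimate
    U, sv, Vt = np.linalg.svd(M - Sh, full_matrices=False)
    Lh = U[:, :r] @ np.diag(sv[:r]) @ Vt[:r]    # rank-r projection

err = np.linalg.norm(Lh - L) / np.linalg.norm(L)
print(err)
```

The column space of the recovered low-rank matrix `Lh` then gives the (robust) PCA solution, exactly as in the problem definition above.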


Robust PCA, Subspace Learning, and Tracking

arXiv.org Machine Learning

PCA is one of the most widely used dimension reduction techniques. A related easier problem is "subspace learning" or "subspace estimation". Given relatively clean data, both are easily solved via singular value decomposition (SVD). The problem of subspace learning or PCA in the presence of outliers is called robust subspace learning or robust PCA (RPCA). For long data sequences, if one tries to use a single lower dimensional subspace to represent the data, the required subspace dimension may end up being quite large. For such data, a better model is to assume that it lies in a low-dimensional subspace that can change over time, albeit gradually. The problem of tracking such data (and the subspaces) while being robust to outliers is called robust subspace tracking (RST). This article provides a magazine-style overview of the entire field of robust subspace learning and tracking. In particular, solutions for three problems are discussed in detail: RPCA via sparse+low-rank matrix decomposition (S+LR), RST via S+LR, and "robust subspace recovery". Robust subspace recovery assumes that an entire data vector is either an outlier or an inlier. The S+LR formulation instead assumes that outliers occur on only a few data vector indices and hence are well modeled as sparse corruptions.
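The difference between the two outlier models can be made concrete with a small sketch on hypothetical toy data: under robust subspace recovery, whole columns are outliers; under S+LR, only a few entries within a column are corrupted.

```python
import numpy as np

rng = np.random.default_rng(6)
n, r, d = 30, 2, 10
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, d))   # inlier data

# "Robust subspace recovery" model: a data vector is wholly outlier or inlier
X_rsr = X.copy()
out_cols = [3, 7]
X_rsr[:, out_cols] = 10.0 * rng.standard_normal((n, len(out_cols)))

# S+LR model: outliers corrupt only a few indices of some columns
X_slr = X.copy()
out_rows = rng.choice(n, 3, replace=False)
X_slr[out_rows, 5] = 10.0 * rng.standard_normal(3)

# Count corrupted entries per column under each model
corr_rsr = (X_rsr != X).sum(axis=0)     # all n entries in each outlier column
corr_slr = (X_slr != X).sum(axis=0)     # only a few entries of one column
print(corr_rsr, corr_slr)
```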


Provable Dynamic Robust PCA or Robust Subspace Tracking

arXiv.org Machine Learning

Dynamic robust PCA refers to the dynamic (time-varying) extension of the robust PCA (RPCA) problem. It assumes that the true (uncorrupted) data lies in a low-dimensional subspace that can change with time, albeit slowly. The goal is to track this changing subspace over time in the presence of sparse outliers. This work provides the first guarantee for dynamic RPCA that holds under weakened versions of standard RPCA assumptions and a few other simple assumptions. We analyze a novel algorithm based on the recently introduced Recursive Projected Compressive Sensing (ReProCS) framework. Our result is significant because (i) it removes the strong assumptions needed by the two previous complete guarantees for ReProCS-based algorithms; (ii) it shows that it is possible to achieve significantly improved outlier tolerance by exploiting slow subspace change and a lower bound on most outlier magnitudes; and (iii) it proves that the proposed algorithm is online (after initialization), fast, and has near-optimal storage complexity.


Finite Sample Guarantees for PCA in Non-Isotropic and Data-Dependent Noise

arXiv.org Machine Learning

This work obtains novel finite sample guarantees for Principal Component Analysis (PCA). These hold even when the corrupting noise is non-isotropic, and a part (or all of it) is data-dependent. Because of the latter, in general, the noise and the true data are correlated. The results in this work are a significant improvement over those given in our earlier work where this "correlated-PCA" problem was first studied. In fact, in certain regimes, our results imply that the sample complexity required to achieve subspace recovery error that is a constant fraction of the noise level is near-optimal. Useful corollaries of our result include guarantees for PCA in sparse data-dependent noise and for PCA with missing data. An important application of the former is in proving correctness of the subspace update step of a popular online algorithm for dynamic robust PCA.
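The role of support changes can be illustrated numerically (a toy sketch, not the paper's analysis; all sizes and noise levels are illustrative): sparse data-dependent noise whose support and coefficients change over time averages out, whereas a single fixed data-dependent noise operator biases PCA no matter how many samples are used.

```python
import numpy as np

rng = np.random.default_rng(7)
n, r, d = 100, 3, 2000
U, _ = np.linalg.qr(rng.standard_normal((n, r)))
A = rng.standard_normal((r, d))
X = U @ A                                        # true data, lies in span(U)

def subspace_err(Y):
    """Largest principal-angle sine between top-r SVD of Y and span(U)."""
    Uh = np.linalg.svd(Y, full_matrices=False)[0][:, :r]
    return np.linalg.norm((np.eye(n) - Uh @ Uh.T) @ U, 2)

# Data-dependent noise with changing support and coefficients: averages out
W1 = np.zeros((n, d))
for t in range(d):
    supp = rng.choice(n, 5, replace=False)
    W1[supp, t] = 0.3 * rng.standard_normal((5, r)) @ A[:, t]

# Same noise level, but one fixed noise operator w_t = B a_t for all t:
# then Y = (U + B) A, so PCA stays biased regardless of the sample size
B = np.zeros((n, r))
B[rng.choice(n, 5, replace=False), :] = 0.3 * rng.standard_normal((5, r))
W2 = B @ A

err_changing = subspace_err(X + W1)
err_fixed = subspace_err(X + W2)
print(err_changing, err_fixed)
```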