Principal Component Analysis


Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing

Neural Information Processing Systems

We develop two methods for the following fundamental statistical task: given an ɛ-corrupted set of n samples from a d-dimensional sub-Gaussian distribution, return an approximate top eigenvector of the covariance matrix.
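The paper's estimators are considerably more sophisticated, but the task itself is easy to simulate. Below is a minimal Python sketch of the setup with a naive norm-trimming baseline for intuition; this is my illustration of the problem, not the paper's algorithm, and the function and variable names are my own.

```python
import numpy as np

def top_eigvec_trimmed(X, eps):
    """Naive baseline: drop the eps-fraction of points with largest norm,
    then return the top eigenvector of the empirical covariance.
    Illustrates the task only; NOT the paper's estimator."""
    n = len(X)
    norms = np.linalg.norm(X, axis=1)
    keep = np.argsort(norms)[: int((1 - eps) * n)]  # discard largest-norm points
    cov = X[keep].T @ X[keep] / len(keep)
    _, V = np.linalg.eigh(cov)                      # eigenvalues in ascending order
    return V[:, -1]                                 # eigenvector of largest eigenvalue

# Demo: spiked Gaussian data with an eps-fraction of large outliers
rng = np.random.default_rng(0)
d, n, eps = 20, 5000, 0.05
v = np.eye(d)[0]                                    # true top eigenvector e_1
X = rng.normal(size=(n, d)) + rng.normal(size=(n, 1)) * 2.0 * v
X[: int(eps * n)] = 50.0 * rng.normal(size=d)       # corrupt an eps-fraction
v_hat = top_eigvec_trimmed(X, eps)
print(abs(v_hat @ v))                               # near 1 when well aligned
```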


Review for NeurIPS paper: Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing

Neural Information Processing Systems

Two new methods for this problem are proposed, one of which uses width-independent Schatten packing SDPs. Reviewers agree that this is interesting, non-trivial, and solid theoretical work that should be accepted for NeurIPS. The rebuttal adequately addressed the reviewers' concerns. The recommendation is to accept this paper for presentation at NeurIPS. We urge the authors to make the connection between the Schatten packing and the main approach clearer in the final version of the paper.


Reviews: Manifold Denoising by Nonlinear Robust Principal Component Analysis

Neural Information Processing Systems

After rebuttal: I would like to thank the authors for addressing the comments. Their responses clarified some of my questions and helped me better understand the paper, and I am therefore glad to increase my score. I am also glad that the authors have decided to add the derivations in Sect. There are a few other things that I hope the authors will address in the final version of the paper: 1. the limitations of the method; 2. a short introduction to RPCA; 3. the related work mentioned in the initial review. The discussion from the authors' response on the stability of kNN vs. ε-NN is very helpful and would be useful to add to the paper; otherwise it is still not clear why the method would not use all the points in the ε-neighborhood (within radius r1).


Manifold Denoising by Nonlinear Robust Principal Component Analysis

Neural Information Processing Systems

This paper extends robust principal component analysis (RPCA) to nonlinear manifolds. Suppose that the observed data matrix is the sum of a sparse component and a component drawn from some low-dimensional manifold. Is it possible to separate them using ideas similar to RPCA? Is there any benefit in treating the manifold as a whole as opposed to treating each local region independently? We answer these two questions affirmatively by proposing and analyzing an optimization framework that separates the sparse component from the manifold under noisy data. Theoretical error bounds are provided when the tangent spaces of the manifold satisfy certain incoherence conditions. We also provide a near-optimal choice of the tuning parameters for the proposed optimization formulation with the help of a new curvature estimation method. The efficacy of our method is demonstrated on both synthetic and real datasets.
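For context, the classical linear RPCA problem that this work generalizes is principal component pursuit: decompose an observed matrix D into a low-rank part L and a sparse part S by solving min ||L||_* + λ||S||_1 subject to L + S = D. The sketch below is a standard augmented-Lagrangian implementation of that linear baseline, not this paper's manifold method; the default parameter choices follow common practice and are assumptions here.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: prox operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0)) @ Vt

def shrink(M, tau):
    """Soft thresholding: prox operator of tau * l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0)

def rpca_pcp(D, lam=None, mu=None, n_iter=200):
    """Classical (linear) RPCA via principal component pursuit:
    min ||L||_* + lam * ||S||_1  s.t.  L + S = D."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else (m * n) / (4.0 * np.abs(D).sum())
    L, S, Y = np.zeros_like(D), np.zeros_like(D), np.zeros_like(D)
    for _ in range(n_iter):
        L = svt(D - S + Y / mu, 1.0 / mu)       # low-rank update
        S = shrink(D - L + Y / mu, lam / mu)    # sparse update
        Y = Y + mu * (D - L - S)                # dual ascent on constraint
    return L, S
```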



Quantum Annealing for Robust Principal Component Analysis

arXiv.org Machine Learning

Principal component analysis is commonly used for dimensionality reduction, feature extraction, denoising, and visualization. The most commonly used principal component analysis method is based on optimization of the L2-norm; however, the L2-norm is known to exaggerate the contribution of errors and outliers. When optimizing over the L1-norm, the components generated are known to exhibit robustness, or resistance, to outliers in the data. The L1-norm components can be found by solving a binary optimization problem. Previously, L1-BF (bit flipping) has been used to solve this binary optimization for multiple components simultaneously. In this paper we propose QAPCA, a new method for finding principal components using quantum annealing hardware that optimizes over the robust L1-norm. The conditions required for convergence of the annealing problem are discussed. The potential speedup when using quantum annealing is demonstrated through complexity analysis and experimental results. To showcase performance against classical principal component analysis techniques, experiments on synthetic Gaussian data, a fault-detection scenario, and breast cancer diagnostic data are studied. We find that the reconstruction error when using QAPCA is comparable to that when using L1-BF.
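For a single component, the L1-PCA problem max over unit w of ||Xᵀw||_1 is equivalent to the binary problem max over b in {−1, +1}ⁿ of ||Xb||_2, with the optimal component recovered as w = Xb/||Xb||_2; bit-flipping heuristics like L1-BF search that binary space greedily, and QAPCA offloads the same combinatorial search to an annealer. Below is a minimal classical bit-flipping sketch of the underlying binary problem; it illustrates the general idea only and is not the QAPCA or L1-BF implementation from the paper.

```python
import numpy as np

def l1pca_bitflip(X, n_restarts=5, seed=0):
    """Greedy bit flipping for one L1 principal component.
    X is d x n with samples as columns. Solves
      max_{b in {-1,+1}^n} ||X b||_2,   then   w = X b / ||X b||_2,
    which is the exact form of the L1-norm component."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    best_b, best_val = None, -np.inf
    for _ in range(n_restarts):
        b = rng.choice([-1.0, 1.0], size=n)
        z = X @ b
        improved = True
        while improved:
            improved = False
            for i in range(n):
                # change in ||X b||^2 if bit i were flipped
                gain = -4 * b[i] * (X[:, i] @ z) + 4 * (X[:, i] @ X[:, i])
                if gain > 1e-12:
                    z -= 2 * b[i] * X[:, i]   # update X b incrementally
                    b[i] = -b[i]
                    improved = True
        if z @ z > best_val:
            best_val, best_b = z @ z, b.copy()
    w = X @ best_b
    return w / np.linalg.norm(w)
```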


Reviews: Robust Principal Component Analysis with Adaptive Neighbors

Neural Information Processing Systems

Update: Thanks for the feedback; I have read it. Yet I don't think it has convinced me to change my decision. For Q2, if the framework is general, the authors should have extended it to more than one case; otherwise, the authors should focus on PCA instead of claiming the framework is general. For Q3 and Q4, I think the discussion in the paper on how to choose k and d is not sufficient.


Robust Principal Component Analysis with Adaptive Neighbors

Neural Information Processing Systems

When certain data points are heavily contaminated, existing principal component analysis (PCA) methods are frequently incapable of filtering them out, which can degrade the resulting models. To tackle this issue, we propose a general framework, robust weight learning with adaptive neighbors (RWL-AN), via which an adaptive weight vector is obtained automatically with both robustness and sparse neighbors. More significantly, the degree of sparsity is steerable, so that only the k best-fitting samples with the least reconstruction errors are active during optimization, while the remaining samples, i.e., the extremely noisy ones, are eliminated for global robustness. Additionally, the framework is applied to the PCA problem to demonstrate the superiority and effectiveness of the proposed RWL-AN model.
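The core mechanism, as the abstract describes it, is a weight vector that keeps only the k samples with the smallest reconstruction errors active. The Python sketch below is a hard top-k simplification of that idea, alternating subspace fitting and sample re-selection; it is my illustration under stated assumptions, not the paper's exact RWL-AN objective or update rules.

```python
import numpy as np

def rpca_topk(X, d_sub, k, n_iter=20):
    """Alternate (1) fitting a d_sub-dimensional PCA subspace on the
    currently active samples and (2) re-activating only the k samples
    with the smallest reconstruction errors. X is (n, d), rows = samples.
    A hard top-k sketch of the adaptive-weight idea, not the paper's model."""
    n, d = X.shape
    active = np.arange(n)                  # start with every sample active
    for _ in range(n_iter):
        mu = X[active].mean(axis=0)
        _, _, Vt = np.linalg.svd(X[active] - mu, full_matrices=False)
        P = Vt[:d_sub]                     # basis of the current subspace
        R = (X - mu) - (X - mu) @ P.T @ P  # residual off the subspace
        err = (R ** 2).sum(axis=1)         # per-sample reconstruction error
        active = np.argsort(err)[:k]       # keep the k best-fitting samples
    return P, active
```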


Reviews: Robust Principal Component Analysis with Adaptive Neighbors

Neural Information Processing Systems

The reviews were mixed, but given the competitive nature of the conference, this paper probably doesn't make the threshold. Since the paper deals with adaptive dimensionality reduction, the following paper seems quite relevant: Lee-Ad Gottlieb, Aryeh Kontorovich, Robert Krauthgamer: Adaptive metric dimensionality reduction.


Review for NeurIPS paper: Estimation and Imputation in Probabilistic Principal Component Analysis with Missing Not At Random Data

Neural Information Processing Systems

Additional Feedback: I have read the other reviews and the authors' feedback. With the addition of the recommender-system experiment, walking readers through (1) how the MNAR and PPCA models apply in this setting, (2) how the hyper-parameters of the imputation algorithm are selected, and (3) how the imputations compare with prior algorithms helps make a strong case for the proposed method. If the authors re-arrange the paper to improve clarity (as the reviews point out, and as they promise in their feedback), the paper can be substantially stronger. There are a few lingering questions from the reviews that the authors should, at a minimum, address in the paper: (1) a discussion of a stage-wise approach to imputation (and why it may not be necessary for their sequence of regressions); (2) given that some of the linear coefficients can be zero, what a practitioner must do when one of the regressions estimates a coefficient close to 0 that is then used in the denominator of other estimates. Even better if the illustration is grounded in an example such as movie item ratings.