Nonconvex Low-Rank Matrix Recovery with Arbitrary Outliers via Median-Truncated Gradient Descent

arXiv.org Machine Learning

Recent work has demonstrated the effectiveness of gradient descent for directly recovering the factors of low-rank matrices from random linear measurements in a globally convergent manner when initialized properly. However, the performance of existing algorithms is highly sensitive to outliers that may take arbitrary values. In this paper, we propose a truncated gradient descent algorithm to improve the robustness against outliers, where in each iteration the truncation adaptively rules out the contributions of samples whose measurement residuals deviate significantly from the sample median of the residuals. We demonstrate that, when initialized in a basin of attraction close to the ground truth, the proposed algorithm converges to the ground truth at a linear rate under the Gaussian measurement model with a near-optimal number of measurements, even when a constant fraction of the measurements is arbitrarily corrupted. In addition, we propose a new truncated spectral method that guarantees an initialization in the basin of attraction at a slightly higher sample complexity. Finally, we provide numerical experiments that validate the superior performance of the proposed approach.
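To make the truncation step concrete, here is a minimal NumPy sketch of median-truncated gradient descent for recovering a rank-r PSD matrix M = U Uᵀ from measurements y_i = ⟨A_i, M⟩. The symmetric factor model, the plain (untruncated) spectral initialization, the step size lr, and the threshold multiplier tau are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def median_truncated_gd(y, A, r, steps=200, lr=0.2, tau=3.0):
    """Median-truncated gradient descent sketch for y_i = <A_i, U U^T>.

    A has shape (m, n, n); y has shape (m,). All hyperparameters here
    are illustrative choices, not the paper's prescribed values.
    """
    m, n, _ = A.shape
    # Spectral initialization: top-r eigenpairs of (1/m) sum_i y_i A_i.
    S = np.einsum('i,ijk->jk', y, A) / m
    w, V = np.linalg.eigh((S + S.T) / 2)
    U = V[:, -r:] * np.sqrt(np.maximum(w[-r:], 0.0))
    for _ in range(steps):
        resid = np.einsum('ijk,jk->i', A, U @ U.T) - y
        # Rule out samples whose residual deviates far from the sample
        # median of absolute residuals; outliers get zero weight.
        keep = np.abs(resid) <= tau * np.median(np.abs(resid))
        G = np.einsum('i,ijk->jk', resid * keep, A) / max(keep.sum(), 1)
        U -= lr * (G + G.T) @ U   # gradient step on the truncated loss
    return U
```

The median is the key design choice: unlike the mean, it is insensitive to a constant fraction of arbitrarily corrupted residuals, so the truncation threshold itself cannot be hijacked by the outliers.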



Memory-efficient Kernel PCA via Partial Matrix Sampling and Nonconvex Optimization: a Model-free Analysis of Local Minima

arXiv.org Machine Learning

Kernel PCA is a widely used nonlinear dimension reduction technique in machine learning, but storing the kernel matrix is notoriously challenging when the sample size is large. Inspired by Yi et al. [2016], who proposed partial matrix sampling followed by nonconvex optimization for matrix completion and robust PCA, we apply a similar approach to memory-efficient Kernel PCA. In theory, with no assumptions on the eigenvalues or eigenvectors of the kernel matrix, we establish a model-free theory for the low-rank approximation produced by any local minimum of the proposed objective function. As interesting byproducts, when the underlying positive semidefinite matrix is assumed to be low-rank and highly structured, corollaries of our main theorem improve the state-of-the-art results of Ge et al. [2016, 2017] on nonconvex matrix completion with no spurious local minima. Numerical experiments also show that our approach is competitive in approximation accuracy with the well-known Nyström algorithm for Kernel PCA.
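A minimal sketch of the sampled-entry idea: evaluate the kernel on a random symmetric subset Ω of index pairs only, then fit a rank-r factor U by gradient descent on the resulting nonconvex least-squares loss. The sampling rate p, the 1/p rescaling, the step size, and the `kernel(X, X)` callback are assumptions made for illustration; the full Gram matrix is formed here only for brevity, whereas the point of the method is to evaluate just the sampled entries.

```python
import numpy as np

def sampled_kernel_pca(X, kernel, r, p=0.1, steps=500, lr=0.01, seed=0):
    """Fit U so that U U^T matches the kernel matrix on a sampled set Omega.

    `kernel(X, X)` is assumed to return the full n x n Gram matrix; a
    memory-efficient implementation would evaluate only the sampled pairs.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    upper = np.triu(rng.random((n, n)) < p)
    mask = upper | upper.T                     # symmetric sampling pattern
    K_obs = np.where(mask, kernel(X, X), 0.0)  # only Omega entries are kept
    U = rng.standard_normal((n, r)) / np.sqrt(n)
    for _ in range(steps):
        R = mask * (U @ U.T - K_obs)           # residual on observed entries
        U -= lr * (2.0 / p) * R @ U            # 1/p rescaling debiases the mask
    return U                                   # U U^T approximates the kernel matrix
```

Only the O(p n²) sampled kernel entries and the n × r factor ever need to be stored, which is where the memory saving comes from.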


Differentially Private High Dimensional Sparse Covariance Matrix Estimation

arXiv.org Machine Learning

In this paper, we study the problem of estimating the covariance matrix under differential privacy, where the underlying covariance matrix is assumed to be sparse and of high dimension. Our approach can be easily extended to local differential privacy. Experiments on synthetic datasets show results consistent with our theoretical claims.

Keywords: differential privacy, sparse covariance estimation, high dimensional statistics

1. Introduction. Machine learning and statistical estimation have made a profound impact in recent years on many applied domains such as social sciences, genomics, and medicine. A frequently encountered challenge in these applications is dealing with the high dimensionality of the datasets, especially those arising in genomics and in educational and psychological research. A commonly adopted strategy for dealing with this issue is to assume that the underlying parameter structures are sparse.
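As a concrete illustration of the two ingredients the abstract names, the sketch below privatizes the sample covariance with symmetric Gaussian noise and then hard-thresholds small entries to enforce sparsity. The sensitivity bound assumes each row of X has ℓ2-norm at most 1; this is a standard textbook construction under those assumptions, not necessarily the paper's exact estimator.

```python
import numpy as np

def dp_sparse_covariance(X, eps, delta, thresh, seed=0):
    """Gaussian-mechanism covariance estimate followed by hard thresholding.

    Assumes mean-zero data with row l2-norms at most 1, so replacing one
    row changes S by at most 2/n in Frobenius norm (the l2-sensitivity).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    S = X.T @ X / n                    # sample covariance
    sigma = (2.0 / n) * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    Z = rng.normal(0.0, sigma, size=(d, d))
    Z = np.triu(Z) + np.triu(Z, 1).T   # symmetric noise; privacy unaffected
    S_priv = S + Z                     # (eps, delta)-DP release
    # Hard thresholding exploits sparsity: keep only large entries.
    return np.where(np.abs(S_priv) >= thresh, S_priv, 0.0)
```

Thresholding after noise addition is what recovers the high-dimensional rate: entries of a sparse covariance that are genuinely zero are zeroed out rather than accumulating d² noise terms.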


Estimation of Monge Matrices

arXiv.org Machine Learning

Monge matrices and their permuted versions known as pre-Monge matrices naturally appear in many domains across science and engineering. While the rich structural properties of such matrices have long been leveraged for algorithmic purposes, little is known about their impact on statistical estimation. In this work, we propose to view this structure as a shape constraint and study the problem of estimating a Monge matrix subject to additive random noise. More specifically, we establish the minimax rates of estimation of Monge and pre-Monge matrices. In the case of pre-Monge matrices, the minimax-optimal least-squares estimator is not efficiently computable, and we propose two efficient estimators and establish their rates of convergence. Our theoretical findings are supported by numerical experiments.
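To make the shape constraint concrete: a matrix M is Monge (under one common sign convention) if M[i,j] + M[i+1,j+1] ≤ M[i,j+1] + M[i+1,j] for every contiguous 2×2 submatrix. Since these inequalities are linear, the least-squares estimator over the Monge cone is a convex quadratic program. The sketch below checks the condition and solves that projection with cvxpy; it illustrates the least-squares estimator only, not the paper's minimax analysis or its efficient estimators for the pre-Monge (permuted) case.

```python
import numpy as np
import cvxpy as cp

def is_monge(M, tol=1e-9):
    """Check M[i,j] + M[i+1,j+1] <= M[i,j+1] + M[i+1,j] on all 2x2 blocks."""
    return bool(np.all(M[:-1, :-1] + M[1:, 1:]
                       <= M[:-1, 1:] + M[1:, :-1] + tol))

def monge_least_squares(Y):
    """Least-squares projection of a noisy observation Y onto the Monge cone.

    The Monge inequalities are linear, so this is a convex QP; cvxpy is
    used here purely to illustrate the estimator.
    """
    n, m = Y.shape
    M = cp.Variable((n, m))
    cons = [M[:-1, :-1] + M[1:, 1:] <= M[:-1, 1:] + M[1:, :-1]]
    cp.Problem(cp.Minimize(cp.sum_squares(M - Y)), cons).solve()
    return M.value
```

Checking the contiguous 2×2 blocks suffices, since the general Monge inequality for non-adjacent rows and columns follows by summing adjacent ones.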