Bridging Convex and Nonconvex Optimization in Robust PCA: Noise, Outliers, and Missing Data
Chen, Yuxin, Fan, Jianqing, Ma, Cong, Yan, Yuling
The imperfectness of data acquisition processes, however, presents several common yet critical challenges: (1) random noise: which reflects the uncertainty of the environment and/or the measurement processes; (2) outliers: which represent a sort of corruption that exhibits abnormal behavior; and (3) incomplete data, namely, we might only get to observe a fraction of the matrix entries. Low-rank matrix estimation algorithms aimed at addressing these challenges have been extensively studied under the umbrella of robust principal component analysis (Robust PCA) [CSPW11, CLMW11], a terminology popularized by the seminal work [CLMW11]. To formulate the above-mentioned problem in a more precise manner, imagine that we seek to estimate an unknown low-rank matrix L null R n n . 1 What we can obtain is a collection of partially observed and corrupted entries as follows M ij L null ij S null ij E ij, (i,j) Ω obs, (1.1) where S null [ S null ij] 1 i,j n is a matrix consisting of outliers, E [ E ij] 1 i,j n represents the random noise, and we only observe entries over an index subset Ω obs [n ] [n ] with [n ]: {1, 2, ···,n }. The current paper assumes that S null is a relatively sparse matrix whose nonzero entries might have arbitrary magnitudes. This assumption has been commonly adopted in prior work to model gross outliers, while enabling reliable disentanglement of the outlier component and the low-rank component [CSPW11,CLMW11,CJSC13,Li13]. In addition, we suppose that the entries {E ij} are independent zero-mean sub-Gaussian random variables, as commonly assumed in the statistics literature to model a large family of random noise. The aim is to reliably estimate L null given the grossly corrupted and possibly incomplete data (1.1). Ideally, this task should be accomplished without knowing the locations and magnitudes of the outliers S null . 1 To avoid cluttered notation, this paper works with square matrices of size n by n. Our results and analysis can be extended to accommodate rectangular matrices.
Jan-15-2020
- Country:
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Genre:
- Research Report > New Finding (0.34)
- Technology: