Zhang, Huishuai, Chi, Yuejie, Liang, Yingbin

This paper investigates the phase retrieval problem, which aims to recover a signal from the magnitudes of its linear measurements. We develop statistically and computationally efficient algorithms for the situation when the measurements are corrupted by sparse outliers that can take arbitrary values. We propose a novel approach to robustify the gradient descent algorithm by using the sample median as a guide for pruning spurious samples in initialization and local search. Adopting the Poisson loss and the reshaped quadratic loss respectively, we obtain two algorithms termed median-TWF and median-RWF, both of which provably recover the signal from a near-optimal number of measurements when the measurement vectors are composed of i.i.d. Gaussian entries, up to a logarithmic factor, even when a constant fraction of the measurements are adversarially corrupted. We further show that both algorithms are stable in the presence of additional dense bounded noise. Our analysis is accomplished by developing non-trivial concentration results of median-related quantities, which may be of independent interest. We provide numerical experiments to demonstrate the effectiveness of our approach.

Tong, Tian, Ma, Cong, Chi, Yuejie

Many problems in data science can be treated as estimating a low-rank matrix from highly incomplete, sometimes even corrupted, observations. One popular approach is to resort to matrix factorization, where the low-rank matrix factors are optimized via first-order methods over a smooth loss function, such as the residual sum of squares. While tremendous progresses have been made in recent years, the natural smooth formulation suffers from two sources of ill-conditioning, where the iteration complexity of gradient descent scales poorly both with the dimension as well as the condition number of the low-rank matrix. Moreover, the smooth formulation is not robust to corruptions. In this paper, we propose scaled subgradient methods to minimize a family of nonsmooth and nonconvex formulations -- in particular, the residual sum of absolute errors -- which is guaranteed to converge at a fast rate that is almost dimension-free and independent of the condition number, even in the presence of corruptions. We illustrate the effectiveness of our approach when the observation operator satisfies certain mixed-norm restricted isometry properties, and derive state-of-the-art performance guarantees for a variety of problems such as robust low-rank matrix sensing and quadratic sampling.

Chi, Yuejie, Lu, Yue M., Chen, Yuxin

Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently. In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, robust principal component analysis, phase synchronization, and joint alignment. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated thinking of optimization and statistics leads to fruitful research findings.

Bhojanapalli, Srinadh, Neyshabur, Behnam, Srebro, Nati

We show that there are no spurious local minima in the non-convex factorized parametrization of low-rank matrix recovery from incoherent linear measurements. With noisy measurements we show all local minima are very close to a global optimum. Together with a curvature bound at saddle points, this yields a polynomial time global convergence guarantee for stochastic gradient descent {\em from random initialization}.

Roddenberry, T. Mitchell, Segarra, Santiago, Kyrillidis, Anastasios

We study the role of the constraint set in determining the solution to low-rank, positive semidefinite (PSD) matrix sensing problems. The setting we consider involves rank-one sensing matrices: In particular, given a set of rank-one projections of an approximately low-rank PSD matrix, we characterize the radius of the set of PSD matrices that satisfy the measurements. This result yields a sampling rate to guarantee singleton solution sets when the true matrix is exactly low-rank, such that the choice of the objective function or the algorithm to be used is inconsequential in its recovery. We discuss applications of this contribution and compare it to recent literature regarding implicit regularization for similar problems. We demonstrate practical implications of this result by applying conic projection methods for PSD matrix recovery without incorporating low-rank regularization.