Towards a Zero-One Law for Column Subset Selection

Neural Information Processing Systems 

There are a number of approximation algorithms for NP-hard versions of low rank approximation, such as finding a rank- k matrix B minimizing the sum of absolute values of differences to a given n -by- n matrix A, \min_{\textrm{rank-}k B}\ A-B\ _1, or more generally finding a rank- k matrix B which minimizes the sum of p -th powers of absolute values of differences, \min_{\textrm{rank-}k B}\ A-B\ _p p . Many of these algorithms are linear time columns subset selection algorithms, returning a subset of \poly(k \log n) columns whose cost is no more than a \poly(k) factor larger than the cost of the best rank- k matrix. The above error measures are special cases of the following general entrywise low rank approximation problem: given an arbitrary function g:\mathbb{R} \rightarrow \mathbb{R}_{\geq 0}, find a rank- k matrix B which minimizes \ A-B\ _g \sum_{i,j}g(A_{i,j}-B_{i,j}) . A natural question is which functions g admit efficient approximation algorithms? Indeed, this is a central question of recent work studying generalized low rank models.