Provable Deterministic Leverage Score Sampling
Papailiopoulos, Dimitris, Kyrillidis, Anastasios, Boutsidis, Christos
We explain theoretically a curious empirical phenomenon: "Approximating a matrix by deterministically selecting a subset of its columns with the corresponding largest leverage scores results in a good low-rank matrix surrogate". To obtain provable guarantees, previous work requires randomized sampling of the columns with probabilities proportional to their leverage scores. In this work, we provide a novel theoretical analysis of deterministic leverage score sampling. We show that such deterministic sampling can be provably as accurate as its randomized counterparts, if the leverage scores follow a moderately steep power-law decay. We support this power-law assumption by providing empirical evidence that such decay laws are abundant in real-world data sets. We then demonstrate empirically the performance of deterministic leverage score sampling, which many times matches or outperforms the state-of-the-art techniques.
Jun-2-2014
- Country:
- North America > United States (0.14)
- Genre:
- Research Report > Promising Solution (0.34)
- Industry:
- Health & Medicine (0.46)
- Technology:
- Information Technology
- Artificial Intelligence > Machine Learning (0.93)
- Communications (0.68)
- Data Science (0.68)
- Information Technology