List-Decodable Subspace Recovery via Sum-of-Squares

Ainesh Bakshi, Pravesh Kothari

arXiv.org Machine Learning 

An influential recent line of work [KLS09, ABL13, DKK+16, LRV16, CSV17, KS17a, KS17b, HL17, DKK+18, DKS17a, KKM18] has focused on designing algorithms for basic statistical estimation tasks in the presence of adversarial outliers. This has led to a body of work on outlier-robust estimation of basic parameters of distributions such as the mean, covariance [DKK+16, DKS17b, CDG19, DKK+17, DKK+18, CDGW19] and moment tensors [KS17b], along with applications to "downstream" learning tasks such as linear and polynomial regression [DKS17c, KKM18, DKK+19, PSBR18]. The upshot of this line of work is a detailed understanding of efficient robust estimation when the fraction of inliers exceeds a majority (> 1/2) but a fixed fraction of the input data consists of arbitrary adversarial outliers.

In this work, we focus on the harsher list-decodable estimation model, where the fraction of inliers α is < 1/2 - i.e., a majority of the input sample are outliers. First considered in [BBV08] in the context of clustering, this was proposed as a model for untrusted data in an influential recent work of Charikar, Steinhardt and Valiant [CSV17]. Since unique recovery is information-theoretically impossible in this setting, the goal is to recover a small (ideally O(1/α)-size) list of parameters, one of which is guaranteed to be close to those of the inlier distribution. A recent series of works has resulted in a high-level blueprint based on the sum-of-squares method for list-decodable estimation, yielding algorithms for list-decodable mean estimation [DKS18, KS17a] and linear regression [KKK19, RY20]. We extend this line of work by giving the first efficient algorithm for list-decodable subspace recovery. In this setting, we are given data with an α fraction of inliers generated i.i.d.
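To make the setting concrete, below is a minimal synthetic sketch of list-decodable subspace recovery. It is not the paper's sum-of-squares algorithm; the Gaussian-on-a-subspace inlier model, the outlier distribution, and all dimensions and scales are illustrative assumptions. It only shows why a single non-robust estimate (plain PCA on all the data) breaks down when outliers form a majority, which is what motivates outputting a small list of candidate subspaces, one of which should be close to the truth.

```python
# Illustrative sketch only (assumed data model; NOT the algorithm from the paper).
import numpy as np

rng = np.random.default_rng(0)
ambient, d, alpha, m = 20, 3, 0.2, 2000   # ambient dim, subspace dim, inlier fraction, samples

# Hidden subspace: orthonormal basis U and its projection matrix P_true.
U, _ = np.linalg.qr(rng.standard_normal((ambient, d)))
P_true = U @ U.T

n_in = int(alpha * m)
inliers = rng.standard_normal((n_in, d)) @ U.T             # i.i.d. Gaussian supported on the subspace
outliers = 5.0 * rng.standard_normal((m - n_in, ambient))  # majority of points are arbitrary (here: large random)
X = np.vstack([inliers, outliers])

def top_d_projector(points, d):
    """Projection matrix onto the top-d right singular directions of `points`."""
    _, _, Vt = np.linalg.svd(points, full_matrices=False)
    V = Vt[:d].T
    return V @ V.T

# Non-robust single estimate: PCA on the full, majority-outlier sample.
P_all = top_d_projector(X, d)
# Oracle estimate computed from the inliers alone; it stands in for the one
# "good" entry that a list-decodable algorithm must include in its output list.
P_oracle = top_d_projector(inliers, d)

print("||P_all    - P_true||_F =", round(np.linalg.norm(P_all - P_true, "fro"), 3))
print("||P_oracle - P_true||_F =", round(np.linalg.norm(P_oracle - P_true, "fro"), 3))
```

Running this, the oracle projector matches the true subspace almost exactly, while PCA on the corrupted sample is far off, illustrating why, with α < 1/2, one must settle for a short list of candidates rather than a single estimate.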
