List-Decodable Linear Regression
Karmalkar, Sushrut, Klivans, Adam R., Kothari, Pravesh K.
We give the first polynomial-time algorithm for robust regression in the list-decodable setting where an adversary can corrupt a greater than $1/2$ fraction of examples. For any $\alpha < 1$, our algorithm takes as input a sample $\{(x_i,y_i)\}_{i \leq n}$ of $n$ linear equations where $\alpha n$ of the equations satisfy $y_i = \langle x_i,\ell^*\rangle +\zeta$ for some small noise $\zeta$ and $(1-\alpha)n$ of the equations are {\em arbitrarily} chosen. It outputs a list $L$ of size $O(1/\alpha)$ - a fixed constant - that contains an $\ell$ that is close to $\ell^*$. Our algorithm succeeds whenever the inliers are chosen from a \emph{certifiably} anti-concentrated distribution $D$. In particular, this gives a $(d/\alpha)^{O(1/\alpha^8)}$ time algorithm to find a $O(1/\alpha)$ size list when the inlier distribution is standard Gaussian. For discrete product distributions that are anti-concentrated only in \emph{regular} directions, we give an algorithm that achieves similar guarantee under the promise that $\ell^*$ has all coordinates of the same magnitude. To complement our result, we prove that the anti-concentration assumption on the inliers is information-theoretically necessary. Our algorithm is based on a new framework for list-decodable learning that strengthens the `identifiability to algorithms' paradigm based on the sum-of-squares method. In an independent and concurrent work, Raghavendra and Yau also used the Sum-of-Squares method to give a similar result for list-decodable regression.
May-30-2019
- Country:
- North America > United States
- New York (0.04)
- Texas > Travis County
- Austin (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Georgia > Fulton County
- Atlanta (0.04)
- California
- San Diego County > San Diego (0.04)
- Los Angeles County
- Los Angeles (0.14)
- Long Beach (0.04)
- Europe
- Sweden > Stockholm
- Stockholm (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Netherlands > South Holland
- Dordrecht (0.04)
- Sweden > Stockholm
- Asia
- Middle East > Jordan (0.04)
- Afghanistan > Parwan Province
- Charikar (0.04)
- North America > United States
- Genre:
- Research Report (0.84)
- Technology: