Learning Structured Distributions From Untrusted Batches: Faster and Simpler

Feb-24-2020–arXiv.org Machine Learning

We revisit the problem of learning from untrusted batches introduced by Qiao and Valiant [QV17]. Recently, Jain and Orlitsky [JO19] gave a simple semidefinite programming approach based on the cut-norm that achieves essentially information-theoretically optimal error in polynomial time. Concurrently, Chen et al. [CLM19] considered a variant of the problem where $\mu$ is assumed to be structured, e.g. log-concave, monotone hazard rate, $t$-modal, etc. In this case, it is possible to achieve the same error with sample complexity sublinear in $n$, and they exhibited a quasi-polynomial time algorithm for doing so using Haar wavelets. In this paper, we find an appealing way to synthesize the techniques of [JO19] and [CLM19] to give the best of both worlds: an algorithm which runs in polynomial time and can exploit structure in the underlying distribution to achieve sublinear sample complexity. Along the way, we simplify the approach of [JO19] by avoiding the need for SDP rounding and giving a more direct interpretation of it through the lens of soft filtering, a powerful recent technique in high-dimensional robust estimation.

batch, lemma 3, step follow, (13 more...)

arXiv.org Machine Learning

Feb-24-2020

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - New York (0.04)
  - Massachusetts > Middlesex County
    - Cambridge (0.04)
- Asia > Afghanistan
  - Parwan Province > Charikar (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology
  - Artificial Intelligence > Machine Learning (1.00)
  - Data Science (0.92)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found