Nearly tight sample complexity bounds for learning mixtures of Gaussians via sample compression schemes
Hassan Ashtiani, Shai Ben-David, Nicholas Harvey, Christopher Liaw, Abbas Mehrabian, Yaniv Plan
Neural Information Processing Systems
We prove that Θ̃(k d^2 / ε^2) samples are necessary and sufficient for learning a mixture of k Gaussians in R^d, up to error ε in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that Õ(k d / ε^2) samples suffice, matching a known lower bound. The upper bound relies on a novel technique for distribution learning based on a notion of sample compression. Any class of distributions that allows such a sample compression scheme can also be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. The core of our main result is showing that the class of Gaussians in R^d has an efficient sample compression.
Dec-31-2018
- Country:
  - Europe
    - Netherlands > South Holland
      - Leiden (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
  - North America
    - Canada
      - British Columbia > Metro Vancouver Regional District
        - Vancouver (0.14)
      - Ontario
        - Hamilton (0.04)
        - Waterloo Region > Waterloo (0.14)
      - Quebec > Montreal (0.14)
    - United States
      - California > Santa Cruz County
        - Santa Cruz (0.04)
      - Massachusetts > Suffolk County
        - Boston (0.04)
      - New York > New York County
        - New York City (0.04)
- Genre:
  - Research Report (0.48)
- Technology: