The Sample Complexity of Semi-Supervised Learning with Nonparametric Mixture Models
Chen Dan, Liu Leqi, Bryon Aragam, Pradeep K. Ravikumar, Eric P. Xing
Neural Information Processing Systems
We study the sample complexity of semi-supervised learning (SSL) and introduce new assumptions based on the mismatch between a mixture model learned from unlabeled data and the true mixture model induced by the (unknown) class conditional distributions. Under these assumptions, we establish an $\Omega(K\log K)$ labeled sample complexity bound without imposing parametric assumptions, where $K$ is the number of classes. Our results suggest that even in nonparametric settings it is possible to learn a near-optimal classifier using only a few labeled samples. Unlike previous theoretical work, which focuses on binary classification, we consider general multiclass classification ($K>2$), which requires solving a difficult permutation learning problem. This permutation defines a classifier whose classification error is controlled by the Wasserstein distance between mixing measures, and we provide finite-sample results characterizing the behavior of the excess risk of this classifier. Finally, we describe three algorithms for computing these estimators based on a connection to bipartite graph matching, and perform experiments to illustrate the superiority of the MLE over the majority vote estimator.
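The permutation learning step described above, matching mixture components learned from unlabeled data to class labels using a few labeled samples, can be sketched as a bipartite matching problem. The following is a minimal illustration, not the paper's implementation: it brute-forces the cluster-to-class permutation that maximizes agreement on labeled samples (for larger $K$, an exact bipartite matching solver such as the Hungarian algorithm would replace the enumeration). All variable names and the toy data are hypothetical.

```python
from itertools import permutations
from collections import Counter

def match_clusters_to_classes(cluster_ids, true_labels, K):
    """Find the cluster -> class permutation maximizing agreement
    on the labeled samples. Brute force over K! permutations,
    fine for small K; exact bipartite matching scales further."""
    # count[(c, y)] = number of labeled points assigned to cluster c
    # whose true class is y (the bipartite edge weights)
    count = Counter(zip(cluster_ids, true_labels))
    best_perm, best_score = None, -1
    for perm in permutations(range(K)):
        score = sum(count[(c, perm[c])] for c in range(K))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm

# Hypothetical toy example: 3 mixture components, 7 labeled samples.
clusters = [0, 0, 1, 1, 2, 2, 0]   # component assignments from unlabeled fit
labels   = [2, 2, 0, 0, 1, 1, 2]   # observed class labels
perm = match_clusters_to_classes(clusters, labels, K=3)
# perm[c] is the class assigned to mixture component c
```

This corresponds to the majority-vote style estimator mentioned in the abstract; the MLE variant would instead weight edges by component log-likelihoods rather than raw counts.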
Dec-31-2018