The Value of Labeled and Unlabeled Examples when the Model is Imperfect

Neural Information Processing Systems

Semi-supervised learning, i.e., learning from both labeled and unlabeled data, has received significant attention in the machine learning literature in recent years. Still, our understanding of the theoretical foundations of the usefulness of unlabeled data remains somewhat limited. The simplest and best understood situation is when the data is described by an identifiable mixture model and each class comes from a pure component. This natural setup and its implications were analyzed in [11, 5]. One important result was that, in certain regimes, labeled data becomes exponentially more valuable than unlabeled data. However, in most realistic situations, one would not expect the data to come from a parametric mixture distribution with identifiable components.

