Theoretical Comparisons of Positive-Unlabeled Learning against Positive-Negative Learning

Neural Information Processing Systems

In PU learning, a binary classifier is trained from positive (P) and unlabeled (U) data without negative (N) data. Although N data is missing, PU learning sometimes outperforms PN learning (i.e., ordinary supervised learning). Hitherto, neither theoretical nor experimental analysis has been given to explain this phenomenon. In this paper, we theoretically compare PU (and NU) learning against PN learning based on the upper bounds on estimation errors. We find simple conditions under which PU and NU learning are likely to outperform PN learning, and we prove that, in terms of the upper bounds, either PU or NU learning (depending on the class-prior probability and the sizes of P and N data) given infinite U data will improve on PN learning. Our theoretical findings agree well with the experimental results on artificial and benchmark data even when the experimental setup does not match the theoretical assumptions exactly.
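The claim that either PU or NU learning beats PN learning given infinite U data can be checked numerically with a small sketch. It rests on an assumption that is not stated in the abstract: that, after dropping shared constants, the three upper bounds scale as pi/sqrt(n_p) + (1-pi)/sqrt(n_n) for PN, 2*pi/sqrt(n_p) + 1/sqrt(n_u) for PU, and 1/sqrt(n_u) + 2*(1-pi)/sqrt(n_n) for NU. The function name `bound_terms` and its parameter names are hypothetical, chosen only for this illustration.

```python
import math


def bound_terms(pi, n_p, n_n, n_u=math.inf):
    """Leading terms of the ASSUMED estimation-error bounds (constants dropped).

    pi  : class-prior probability of the positive class, 0 < pi < 1
    n_p : number of positive samples
    n_n : number of negative samples
    n_u : number of unlabeled samples (infinite by default)
    """
    p_term = pi / math.sqrt(n_p)
    n_term = (1.0 - pi) / math.sqrt(n_n)
    # With infinite unlabeled data the U term vanishes.
    u_term = 0.0 if math.isinf(n_u) else 1.0 / math.sqrt(n_u)
    return {
        "PN": p_term + n_term,
        "PU": 2.0 * p_term + u_term,
        "NU": u_term + 2.0 * n_term,
    }


# Many positives, few negatives: the P term is the smaller one, so PU wins.
many_p = bound_terms(pi=0.5, n_p=100, n_n=25)
# Few positives, many negatives: the N term is the smaller one, so NU wins.
many_n = bound_terms(pi=0.5, n_p=25, n_n=100)
```

Under this assumed scaling and with infinite U data, PU has the smaller bound exactly when pi/sqrt(n_p) < (1-pi)/sqrt(n_n), and NU has the smaller bound when the inequality is reversed, so one of the two never does worse than PN. This matches the dependence on the class prior and on the sizes of the P and N data mentioned in the abstract.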


Reviews: Theoretical Comparisons of Positive-Unlabeled Learning against Positive-Negative Learning

Neural Information Processing Systems

The basic problem studied in the paper concerns learning from data which is only partially labelled, but nonetheless doing better than with fully labelled data. In the particular scenario where one has a pool of unlabelled data in lieu of one of the classes, the paper seeks to quantify the impact this has on the estimation error of the learned classifier. Determining the degradation or lack thereof when learning from unlabelled data is interesting, and thus the paper seems well motivated. The machinery used to illustrate its messages is fairly standard -- the key quantities, namely the estimation errors for each scenario, are derived from a simple Rademacher analysis -- however, the final results appear novel, with implications worked through in various scenarios. The paper hinges on the simple facts that (a) different risks (for PN/PU/NU learning) may be seen as employing different weightings on individual risks for the positive and negative class, and (b) these weightings are reflected in appropriate terms for the estimation error.
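Point (a) can be made concrete with a short worked identity. The notation below is an assumption introduced for illustration and does not appear in the review: \pi is the class-prior probability of the positive class, and the loss \ell is assumed to satisfy \ell(t) + \ell(-t) = 1 (as the zero-one loss does). Under that assumption, the same classification risk can be written in three ways, each placing a different weight on the positive and negative class risks.

```latex
% Assumed notation (not from the review):
%   \pi          = p(y = +1)
%   R_+(g)       = E_{x \mid y=+1}[\ell(g(x))]
%   R_-(g)       = E_{x \mid y=-1}[\ell(-g(x))]
%   R_{u,-}(g)   = E_{x}[\ell(-g(x))]   (unlabeled data treated as negative)
%   R_{u,+}(g)   = E_{x}[\ell(g(x))]    (unlabeled data treated as positive)
% Assumed loss property: \ell(t) + \ell(-t) = 1.
\begin{align*}
R_{\mathrm{pn}}(g) &= \pi R_+(g) + (1-\pi) R_-(g), \\
R_{\mathrm{pu}}(g) &= 2\pi R_+(g) + R_{u,-}(g) - \pi, \\
R_{\mathrm{nu}}(g) &= R_{u,+}(g) + 2(1-\pi) R_-(g) - (1-\pi).
\end{align*}
% Derivation of the PU form:
%   R_{u,-}(g) = \pi (1 - R_+(g)) + (1-\pi) R_-(g)
%   \Rightarrow (1-\pi) R_-(g) = R_{u,-}(g) - \pi + \pi R_+(g).
```

The three expressions agree at the population level; they differ only once each term is replaced by an empirical average, where the weights 2\pi and 2(1-\pi) scale the contribution of the P and N samples to the estimation error, which is the sense of point (b).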

