Learning from Positive and Unlabeled Data with Arbitrary Positive Shift
Positive-unlabeled (PU) learning trains a binary classifier using only positive and unlabeled data. A common simplifying assumption is that the positive data is representative of the target positive class. This assumption rarely holds in practice due to temporal drift, domain shift, and/or adversarial manipulation. This paper shows that PU learning is possible even with arbitrarily non-representative positive data given unlabeled data from the source and target distributions. Our key insight is that only the negative class's distribution need be fixed. We integrate this into two statistically consistent methods to address arbitrary positive bias: one approach combines negative-unlabeled learning with unlabeled-unlabeled learning, while the other uses a novel, recursive risk estimator. Experimental results demonstrate our methods' effectiveness across numerous real-world datasets and forms of positive bias, including disjoint positive class-conditional supports. Additionally, we propose a general, simplified approach to address PU risk estimation overfitting.
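For background, the standard non-negative PU risk estimate (the usual remedy for the risk-estimation overfitting the abstract mentions, in the style of Kiryo et al., 2017) can be sketched as follows. This is an illustrative baseline only, not the paper's proposed estimators; the function names and the toy data are hypothetical.

```python
import numpy as np

def nn_pu_risk(g, x_pos, x_unl, pi, loss):
    """Non-negative PU risk estimate (illustrative sketch, not this paper's method).

    g:    decision function mapping an array of inputs to real scores
    pi:   class prior p(y=+1), assumed known or estimated separately
    loss: margin loss evaluated at y * g(x), e.g. sigmoid loss
    """
    r_pos = pi * np.mean(loss(g(x_pos)))                 # positive-class risk
    # Negative-class risk recovered from unlabeled data minus the
    # positive contribution; clamped at zero to curb overfitting.
    r_neg = np.mean(loss(-g(x_unl))) - pi * np.mean(loss(-g(x_pos)))
    return r_pos + max(0.0, r_neg)

# Toy example with a linear scorer and sigmoid loss (all data hypothetical):
sigmoid_loss = lambda z: 1.0 / (1.0 + np.exp(z))
rng = np.random.default_rng(0)
x_p = rng.normal(1.0, 1.0, size=200)   # labeled positives
x_u = rng.normal(0.0, 1.5, size=500)   # unlabeled mixture
g = lambda x: x                        # toy decision function
risk = nn_pu_risk(g, x_p, x_u, pi=0.4, loss=sigmoid_loss)
```

The clamp `max(0.0, r_neg)` prevents the empirical negative-class risk from going negative, which is the failure mode that lets flexible models overfit the unbiased PU estimate.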
Review for NeurIPS paper: Learning from Positive and Unlabeled Data with Arbitrary Positive Shift
Additional Feedback: Overall comment: Although I enjoyed reading the paper and it proposes novel ideas for PU learning research, I couldn't give a high score because: (1) I feel it is hard to compare methods in the experiments due to the use of different models for the proposed method and the baselines; (2) some of the work in this paper (Sec. Other comments: The output of a logistic classifier lies between 0 and 1, and in theory it should be an estimate of p(y|x). In practice, the estimate of p(y|x) can be quite noisy, or may overfit and yield peaky \hat{p}(y|x) distributions, as discussed in papers like "On Calibration of Modern Neural Networks" (ICML 2017). Assuming \hat{\sigma}(x) \approx p_tr(y=-1|x) seems to be a strong assumption; does this cause any issues in the experiments? A minor suggestion is to investigate confidence calibration and see how sensitive the final PU classifier is to worse calibration.
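The calibration check the reviewer suggests could be as simple as temperature scaling (the method from the cited ICML 2017 paper): fit a single scalar T on held-out data and inspect whether T departs from 1. A minimal sketch, using a grid search in place of the usual LBFGS fit; the data and variable names are illustrative.

```python
import numpy as np

def temperature_scale(logits, labels, temps=np.linspace(0.5, 5.0, 46)):
    """Pick the temperature T minimizing NLL of sigmoid(logit / T).

    Grid-search stand-in for the usual optimizer-based fit.
    logits: raw binary scores; labels: array of {0, 1}.
    """
    def nll(T):
        p = 1.0 / (1.0 + np.exp(-logits / T))
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    return min(temps, key=nll)

# Overconfident toy classifier: scores near logit 3 (p ~ 0.95) while the
# true positive rate is only 0.7, independent of the scores.
rng = np.random.default_rng(0)
labels = (rng.random(1000) < 0.7).astype(float)
logits = np.full(1000, 3.0) + rng.normal(0.0, 0.3, size=1000)
T = temperature_scale(logits, labels)  # T > 1 signals overconfidence
```

A fitted T well above 1 means the classifier's \hat{p}(y|x) is peakier than the data supports, which is exactly the failure mode that could undermine the \hat{\sigma}(x) assumption the review questions.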
Meta-review for NeurIPS paper: Learning from Positive and Unlabeled Data with Arbitrary Positive Shift
Following the author response and discussion, all reviewers had an overall positive impression of the paper, highlighting some salient features:
- studies an interesting and under-explored problem setting, namely, PU learning where the positive samples are from a distribution unrelated to that of the target distribution
- the proposed method is equipped with theoretical guarantees, and is demonstrated to perform well empirically
Some areas for improvement include:
- the lack of comparison against bPU. The argument of such techniques making an assumption that does not hold is fine, but how well do they perform on the tasks considered here?
The authors are encouraged to incorporate these in a revised version.