Unsupervised or Indirectly Supervised Learning
Review for NeurIPS paper: Learning from Positive and Unlabeled Data with Arbitrary Positive Shift
Additional Feedback: Overall comment: Although I enjoyed reading the paper and it proposes novel ideas for PU learning research, I couldn't give a high score because: (1) it is hard to compare methods in the experiments due to the use of different models for the proposed approach and the baselines, and (2) some of the work in this paper (Sec. ...). Other comments: The output of a logistic classifier lies between 0 and 1 and should, in theory, be an estimate of p(y|x). In practice, the estimate of p(y|x) can be quite noisy, or the model may overfit and produce peaked \hat{p}(y|x) distributions, as reported in papers like "On Calibration of Modern Neural Networks" (ICML 2017). Assuming \hat{\sigma}(x) \approx p_tr(y=-1|x) seems to be a strong assumption, but does this cause any issues in the experiments? A minor suggestion is to investigate confidence calibration and see how sensitive the final PU classifier is to poor calibration.
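The calibration concern above can be illustrated with temperature scaling, the post-hoc fix studied in "On Calibration of Modern Neural Networks". The sketch below is a minimal numpy illustration with a hypothetical, deliberately overconfident toy classifier; none of the data or names come from the paper under review.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nll(probs, labels, eps=1e-12):
    # Binary negative log-likelihood on held-out data; poor calibration
    # of hat{p}(y|x) inflates this even when accuracy is unchanged.
    p = np.clip(probs, eps, 1 - eps)
    return -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))

def fit_temperature(logits, labels, grid=np.linspace(0.25, 4.0, 100)):
    # Grid-search a single scalar T that rescales logits before the sigmoid.
    # Temperature scaling preserves the ranking of predictions, so accuracy
    # is untouched; only the confidence estimates move.
    losses = [nll(sigmoid(logits / T), labels) for T in grid]
    return grid[int(np.argmin(losses))]

# Toy, overconfident classifier: its logits are 3x too large
# relative to the true p(y|x) that generated the labels.
rng = np.random.default_rng(0)
true_logits = rng.normal(size=2000)
labels = (rng.random(2000) < sigmoid(true_logits)).astype(float)
overconfident_logits = 3.0 * true_logits

T = fit_temperature(overconfident_logits, labels)
```

The fitted temperature recovers roughly the overconfidence factor (here about 3), and dividing the logits by it sharply lowers the held-out NLL. Repeating a downstream PU pipeline with and without such a correction would be one way to run the sensitivity study suggested above.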
Learning from Positive and Unlabeled Data with Arbitrary Positive Shift
Positive-unlabeled (PU) learning trains a binary classifier using only positive and unlabeled data. A common simplifying assumption is that the positive data is representative of the target positive class. This assumption rarely holds in practice due to temporal drift, domain shift, and/or adversarial manipulation. This paper shows that PU learning is possible even with arbitrarily non-representative positive data given unlabeled data from the source and target distributions. Our key insight is that only the negative class's distribution need be fixed. We integrate this into two statistically consistent methods to address arbitrary positive bias - one approach combines negative-unlabeled learning with unlabeled-unlabeled learning while the other uses a novel, recursive risk estimator. Experimental results demonstrate our methods' effectiveness across numerous real-world datasets and forms of positive bias, including disjoint positive class-conditional supports. Additionally, we propose a general, simplified approach to address PU risk estimation overfitting.
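As background for the overfitting issue the last sentence mentions: standard unbiased PU risk estimates can go negative on flexible models, which the non-negative correction of Kiryo et al. (2017) clamps at zero. The sketch below shows that standard non-negative estimator, not the paper's recursive one; the hinge loss, 1-D toy data, and linear scorer are illustrative assumptions.

```python
import numpy as np

def nn_pu_risk(g, x_pos, x_unl, prior, loss=lambda z: np.maximum(0.0, 1 - z)):
    """Non-negative PU risk (Kiryo et al., 2017) for a scorer g.

    R = pi * R_p^+ + max(0, R_u^- - pi * R_p^-), where the max(0, .)
    clamp prevents the negative-risk overfitting of the plain
    unbiased estimator. Background only, not the paper's estimator.
    """
    r_pos = np.mean(loss(g(x_pos)))        # positives scored as positive
    r_pos_neg = np.mean(loss(-g(x_pos)))   # positives scored as negative
    r_unl_neg = np.mean(loss(-g(x_unl)))   # unlabeled scored as negative
    return prior * r_pos + max(0.0, r_unl_neg - prior * r_pos_neg)

# Toy 1-D example: positives near +2, negatives near -2,
# unlabeled mixture with class prior pi = 0.3, scorer g(x) = x.
rng = np.random.default_rng(1)
x_p = rng.normal(loc=2.0, size=500)
x_u = np.concatenate([rng.normal(2.0, size=300), rng.normal(-2.0, size=700)])
risk = nn_pu_risk(lambda x: x, x_p, x_u, prior=0.3)
```

The clamp is what keeps the estimate sensible here: without it, sampling noise in the unlabeled term can drive the estimated risk below zero, which a trained model will then exploit.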
Review for NeurIPS paper: Learning from Positive and Unlabeled Data with Arbitrary Positive Shift
Following the author response and discussion, all reviewers had an overall positive impression of the paper, highlighting some salient features:
- studies an interesting and under-explored problem setting, namely, PU learning where the positive samples are from a distribution unrelated to that of the target distribution
- the proposed method is equipped with theoretical guarantees, and is demonstrated to perform well empirically

Some areas for improvement include:
- the lack of comparison against bPU. The argument of such techniques making an assumption that does not hold is fine, but how well do they perform on the tasks considered here?

The authors are encouraged to incorporate these in a revised version.
Reviews: Are Labels Required for Improving Adversarial Robustness?
This paper combines the two and conducts adversarial training by applying the regularization term on unlabeled data. The theoretical and empirical analyses are new. However, some crucial points are not well addressed. It is observed that the proposed method (UAT) behaves differently on CIFAR10 and SVHN (Fig. 1) with respect to m. On CIFAR10, it outperforms the others starting from m = 4k.
Are Labels Required for Improving Adversarial Robustness?
Jean-Baptiste Alayrac, Jonathan Uesato, Po-Sen Huang, Alhussein Fawzi, Robert Stanforth, Pushmeet Kohli
Recent work has uncovered the interesting (and somewhat surprising) finding that training models to be invariant to adversarial perturbations requires substantially larger datasets than those required for standard classification. This result is a key hurdle in the deployment of robust machine learning models in many real-world applications where labeled data is expensive. Our main insight is that unlabeled data can be a competitive alternative to labeled data for training adversarially robust models. Theoretically, we show that in a simple statistical setting, the sample complexity for learning an adversarially robust model from unlabeled data matches the fully supervised case up to constant factors. On standard datasets like CIFAR-10, a simple Unsupervised Adversarial Training (UAT) approach using unlabeled data improves robust accuracy by 21.7% over using 4K supervised examples alone, and captures over 95% of the improvement from the same number of labeled examples. Finally, we report an improvement of 4% over the previous state-of-the-art on CIFAR-10 against the strongest known attack by using additional unlabeled data from the uncurated 80 Million Tiny Images dataset. This demonstrates that our finding extends to the more realistic case where unlabeled data is also uncurated, therefore opening a new avenue for improving adversarial training.
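The core UAT idea described above (use the model's own prediction on an unlabeled point as a fixed target, then enforce that target under a worst-case perturbation) can be sketched with a toy linear model. The logistic model, the single FGSM-style inner step, and all names below are illustrative assumptions for this sketch, not the architecture or attack used in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def uat_smoothness_loss(w, x, eps=0.1):
    """UAT-style term on unlabeled data for a linear scorer x @ w.

    Treat the model's own hard prediction as a fixed pseudo-label,
    then measure cross-entropy at a perturbation inside an L_inf
    ball of radius eps (one FGSM step; illustrative only).
    """
    p = sigmoid(x @ w)
    pseudo = (p > 0.5).astype(float)
    # For a linear model, the cross-entropy gradient w.r.t. x is
    # (p - y) * w, so the FGSM step is eps * sign of that gradient.
    x_adv = x + eps * np.sign((p - pseudo)[:, None] * w[None, :])
    p_adv = np.clip(sigmoid(x_adv @ w), 1e-12, 1 - 1e-12)
    return -np.mean(pseudo * np.log(p_adv) + (1 - pseudo) * np.log(1 - p_adv))

rng = np.random.default_rng(0)
x_unlabeled = rng.normal(size=(1000, 5))  # toy unlabeled batch
w = rng.normal(size=5)                    # toy model parameters
loss = uat_smoothness_loss(w, x_unlabeled)
```

No labels appear anywhere in the term, which is the point: minimizing it (alongside a supervised loss on the small labeled set) pushes the decision function to be locally stable around unlabeled points.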
Review for NeurIPS paper: Uncertainty Aware Semi-Supervised Learning on Graph Data
Clarity: Overall the paper is very clear. The authors did an excellent job. Equation 5 - I am confused about a few things. The notation P(y|x; \theta) is confusing because the semicolon implies that \theta is a fixed parameter vector and not a random vector; however, the conditional distribution of \theta is given as P(\theta|G). So what is the point of the semicolon? Also, I think there is a typo in Equation 5 because the entropy term is not defined correctly.
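For reference, the entropy term in uncertainty objectives of this kind is usually the Shannon entropy of the predictive distribution; the sketch below is one plausible reading of that term, not necessarily the authors' Equation 5.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    # Shannon entropy H[P(y|x)] = -sum_y P(y|x) log P(y|x),
    # computed over the last axis of a batch of class distributions.
    # One plausible reading of the entropy term, not the paper's exact form.
    p = np.clip(probs, eps, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

# Maximal for a uniform prediction, zero for a one-hot prediction.
h = predictive_entropy(np.array([[0.5, 0.5], [1.0, 0.0]]))
```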
Review for NeurIPS paper: Uncertainty Aware Semi-Supervised Learning on Graph Data
R#2 and R#3 generally liked the paper. R#1 gave a brief review that raised a concern about the novelty of the method. The rebuttal addressed the concerns well and led all reviewers to increase their scores. We have also collected comments from an additional reviewer, who pointed out further issues with the writing and the theoretical results (see below). We advise the authors to make an effort to address these issues in the revision.
Reviews: Leveraging Labeled and Unlabeled Data for Consistent Fair Binary Classification
Originality: - Although the Hardt paper suggested the use of this approach, this paper claims to be the first to actually show that it is indeed possible. Quality: - The assumptions made about the model are very well justified. The discussion after each assumption provides context for why the assumption makes sense and why it is needed to study the model. These discussions provide very good intuition and set the stage for the proofs. Clarity: - Overall, the paper has a very smooth flow, whether in the discussion of the assumptions or in the remarks.
Reviews: Leveraging Labeled and Unlabeled Data for Consistent Fair Binary Classification
Three reviewers, all good experts for this paper, found it interesting, novel, compelling, and well-written. With such a difficult topic as fairness, it was particularly helpful that the authors were able to discuss their assumptions, results, and proofs so clearly, which definitely adds value to the work. The authors' response was appreciated and found to be helpful, but in discussion the reviewers expressed some concern about adding too many new results they did not have a chance to review. So while we hope the authors can address some of the reviewers' suggestions in the final paper, they are encouraged not to add too much material that was not reviewed, and instead to consider expanding on some of it in a follow-on submission.
... comments on their remarks and questions. The combination of the guess loss with the additive noise beats the out-of-the-box CycleGAN on the GTA dataset in terms of ...
We cannot thank the reviewers enough for their valuable feedback on our work. Reviewers 1 and 2: Combine guess loss with additive noise. Most recent advances in adversarial defense methods address "black-box attacks" performed by a ... The latter incorporates adversarial examples during training to increase the model's robustness to the attack. Therefore the reconstructed image can serve as an adversarially perturbed example of the non-adversarial input image. Reviewer 3: Novelty is not enough, as most of the proposed solutions or observations are already published.