 Unsupervised or Indirectly Supervised Learning


Reviews: Are Labels Required for Improving Adversarial Robustness?

Neural Information Processing Systems

This paper combines the two and conducts adversarial training by applying the regularization term to unlabeled data. The theoretical and empirical analyses are new. However, some crucial points are not well addressed. It is observed that the proposed method (UAT) behaves differently on CIFAR-10 and SVHN (Fig. 1) with respect to m. On CIFAR-10, it outperforms the others starting from m = 4k.


Are Labels Required for Improving Adversarial Robustness?

Neural Information Processing Systems

Recent work has uncovered the interesting (and somewhat surprising) finding that training models to be invariant to adversarial perturbations requires substantially larger datasets than those required for standard classification. This result is a key hurdle in the deployment of robust machine learning models in many real-world applications where labeled data is expensive. Our main insight is that unlabeled data can be a competitive alternative to labeled data for training adversarially robust models. Theoretically, we show that in a simple statistical setting, the sample complexity for learning an adversarially robust model from unlabeled data matches the fully supervised case up to constant factors. On standard datasets like CIFAR-10, a simple Unsupervised Adversarial Training (UAT) approach using unlabeled data improves robust accuracy by 21.7% over using 4K supervised examples alone, and captures over 95% of the improvement from the same number of labeled examples. Finally, we report an improvement of 4% over the previous state-of-the-art on CIFAR-10 against the strongest known attack by using additional unlabeled data from the uncurated 80 Million Tiny Images dataset. This demonstrates that our finding also extends to the more realistic case where the unlabeled data is uncurated, thereby opening a new avenue for improving adversarial training.
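
To make the idea concrete, here is a minimal sketch (not the authors' released code) of the kind of training step the abstract describes: a supervised adversarial loss on the labeled batch plus a label-free smoothness regularizer on the unlabeled batch, in the spirit of the UAT idea. The PGD settings, the weight `lam`, and the helper names are illustrative assumptions.

```python
# A minimal sketch, not the authors' released code: one adversarial training
# step that uses labeled data with a supervised loss and unlabeled data with a
# label-free smoothness regularizer, in the spirit of the UAT idea above.
# The PGD settings and the weight `lam` are illustrative assumptions.
import torch
import torch.nn.functional as F

def pgd_perturb(model, x, target_logits, eps=8/255, alpha=2/255, steps=7):
    """Find an L-inf perturbation that pushes the model's prediction away from
    `target_logits` -- note that no ground-truth labels are needed."""
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        adv_logits = model(x + delta)
        div = F.kl_div(F.log_softmax(adv_logits, dim=1),
                       F.softmax(target_logits, dim=1), reduction='batchmean')
        grad, = torch.autograd.grad(div, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).detach()

def uat_style_step(model, optimizer, x_lab, y_lab, x_unlab, lam=1.0):
    with torch.no_grad():
        clean_lab = model(x_lab)      # targets for the labeled attack
        clean_unlab = model(x_unlab)  # "online" targets for the unlabeled regularizer
    x_lab_adv = pgd_perturb(model, x_lab, clean_lab)
    x_unlab_adv = pgd_perturb(model, x_unlab, clean_unlab)

    sup_loss = F.cross_entropy(model(x_lab_adv), y_lab)
    smooth_loss = F.kl_div(F.log_softmax(model(x_unlab_adv), dim=1),
                           F.softmax(clean_unlab, dim=1), reduction='batchmean')
    loss = sup_loss + lam * smooth_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the unlabeled branch only matches the model's own clean predictions, it needs no labels at all, which is the point the review above highlights.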


Review for NeurIPS paper: Uncertainty Aware Semi-Supervised Learning on Graph Data

Neural Information Processing Systems

Clarity: Overall, the paper is very clear. The authors did an excellent job. Equation 5 - I am confused about a few things. The notation P(y | x; theta) is confusing because the semicolon implies that theta is a fixed parameter vector and not a random vector; however, the conditional distribution of theta is given as P(theta | G). So what is the point of the semicolon? Also, I think there is a typo in Equation 5, because the entropy term is not defined correctly.
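
For readers puzzling over the same notation, the following is a hedged sketch of one common reading of such an entropy term, not the paper's construction: theta is treated as a fixed parameter inside P(y | x; theta), the randomness enters by drawing theta from P(theta | G), and the entropy is taken over the averaged (marginal) predictive distribution. The `model_samples` argument (e.g., ensemble members or MC-dropout passes) is an assumption made for illustration.

```python
# A hedged sketch of one common reading of such a term, not the paper's code:
# theta is a fixed parameter inside P(y | x; theta), uncertainty comes from
# drawing theta ~ P(theta | G), and the entropy is taken over the averaged
# (marginal) predictive distribution. `model_samples` is an assumption.
import torch

def predictive_entropy(model_samples, x, eps=1e-12):
    """Entropy of (1/S) * sum_s P(y | x; theta_s) with theta_s ~ P(theta | G)."""
    probs = torch.stack([torch.softmax(m(x), dim=-1) for m in model_samples])  # [S, N, C]
    mean_probs = probs.mean(dim=0)                                             # [N, C]
    return -(mean_probs * (mean_probs + eps).log()).sum(dim=-1)                # [N]
```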


Review for NeurIPS paper: Uncertainty Aware Semi-Supervised Learning on Graph Data

Neural Information Processing Systems

R#2 and R#3 generally liked the paper. R#1 gave a brief review that raised a concern about the novelty of the method. The rebuttal addressed the concerns well and led all reviewers to increase their scores. We have collected comments from an additional reviewer, who pointed out further issues with the writing and the theoretical results (see below). We advise the authors to make an effort to address these issues in the revision.


Not All Out-of-Distribution Data Are Harmful to Open-Set Active Learning (Yang Yang, Nanjing University of Science and Technology)

Neural Information Processing Systems

Active learning (AL) methods have proven to be an effective way to reduce labeling effort by intelligently selecting valuable instances for annotation. Despite their great success in in-distribution (ID) scenarios, AL methods suffer from performance degradation in many real-world applications because out-of-distribution (OOD) instances are inevitably contained in the unlabeled data, which may lead to inefficient sampling. Therefore, several attempts have explored open-set AL by strategically selecting pure ID instances while filtering out OOD instances. However, concentrating solely on selecting pseudo-ID instances may constrain the training of both the ID classifier and the OOD detector. To address this issue, we propose a simple yet effective sampling scheme, Progressive Active Learning (PAL), which employs a progressive sampling mechanism to leverage the active selection of valuable OOD instances. PAL measures unlabeled instances by synergistically evaluating their informativeness and representativeness, and can thus balance pseudo-ID and pseudo-OOD instances in each round to enhance the capacity of both the ID classifier and the OOD detector. Extensive experiments on various open-set AL scenarios demonstrate the effectiveness of the proposed PAL compared with state-of-the-art methods. The code is available at https://github.com/njustkmg/PAL.
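
As an illustration only (not the authors' implementation), one way to read the recipe above is to combine informativeness and representativeness into a single acquisition score, split the unlabeled pool into pseudo-ID and pseudo-OOD using an OOD detector, and progressively shift the per-round annotation budget between the two pools. The threshold, schedule, and function names below are assumptions.

```python
# An illustrative sketch only, not the authors' implementation of PAL:
# combine informativeness and representativeness into one acquisition score,
# split the unlabeled pool into pseudo-ID / pseudo-OOD with an OOD detector,
# and progressively shift the per-round budget between the two pools.
import numpy as np

def pal_style_round(info_scores, repr_scores, ood_scores, budget,
                    round_idx, num_rounds, ood_threshold=0.5):
    score = info_scores + repr_scores              # combined acquisition score
    pseudo_ood = ood_scores > ood_threshold        # detector's OOD guesses
    # progressive mixing: early rounds favor pseudo-ID, later rounds admit more OOD
    ood_frac = 0.5 * round_idx / max(num_rounds - 1, 1)
    n_ood = int(budget * ood_frac)
    n_id = budget - n_ood

    id_pool = np.where(~pseudo_ood)[0]
    ood_pool = np.where(pseudo_ood)[0]
    pick_id = id_pool[np.argsort(-score[id_pool])[:n_id]]
    pick_ood = ood_pool[np.argsort(-score[ood_pool])[:n_ood]]
    return np.concatenate([pick_id, pick_ood])     # indices to send for annotation
```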



Reviews: Leveraging Labeled and Unlabeled Data for Consistent Fair Binary Classification

Neural Information Processing Systems

Originality: - Although the Hardt paper has suggested the use of this approach, this paper claims to be the first to actually show that it is indeed possible. Quality: - The assumptions made about the model are very well justified. The discussion after each assumption provided the context as to why the assumption makes sense and why it is needed to study their model. These discussions, as a result, provided very good intuition and set the stage for the proof. Clarity: - Overall, the paper has a very smooth flow, whether it be the discussion of their assumptions or their remarks.


Reviews: Leveraging Labeled and Unlabeled Data for Consistent Fair Binary Classification

Neural Information Processing Systems

Three reviewers, all good experts for this paper, found the paper interesting, novel, compelling, and well written. With a topic as difficult as fairness, it was particularly helpful that the authors were able to discuss their assumptions, results, and proofs so clearly, which definitely adds value to the work. The authors' response was appreciated and found to be helpful, but the reviewers expressed some concern in discussion about adding too many new results that they did not have a chance to review. While we hope the authors can address some of the reviewers' suggestions in the final paper, they are encouraged not to add too much material that was not reviewed, and instead to consider expanding on it in a follow-on submission.


Comments on their remarks and questions. Combination of the guess loss with the additive noise beats the out-of-the-box CycleGAN on the GTA dataset in terms

Neural Information Processing Systems

We cannot thank the reviewers enough for their valuable feedback on our work. Reviewers 1 and 2: Combine the guess loss with additive noise. Most recent advances in adversarial defense methods address "black-box attacks" performed by a The latter incorporates adversarial examples during training to increase the model's robustness to the attack. Therefore, the reconstructed image can serve as an adversarially perturbed example of the non-adversarial input image. Reviewer 3: Novelty is not enough, as most of the proposed solutions or observations have already been published.


Contrastive learning of global and local features for medical image segmentation with limited annotations

Neural Information Processing Systems

A key requirement for the success of supervised deep learning is a large labeled dataset - a condition that is difficult to meet in medical image analysis. Self-supervised learning (SSL) can help in this regard by providing a strategy to pre-train a neural network with unlabeled data, followed by fine-tuning for a downstream task with limited annotations. Contrastive learning, a particular variant of SSL, is a powerful technique for learning image-level representations. In this work, we propose strategies for extending the contrastive learning framework for segmentation of volumetric medical images in the semi-supervised setting with limited annotations, by leveraging domain-specific and problem-specific cues. Specifically, we propose (1) novel contrasting strategies that leverage structural similarity across volumetric medical images (domain-specific cue) and (2) a local version of the contrastive loss to learn distinctive representations of local regions that are useful for per-pixel segmentation (problem-specific cue). We carry out an extensive evaluation on three Magnetic Resonance Imaging (MRI) datasets. In the limited annotation setting, the proposed method yields substantial improvements compared to other self-supervision and semi-supervised learning techniques. When combined with a simple data augmentation technique, the proposed method reaches within 8% of benchmark performance using only two labeled MRI volumes for training, corresponding to only 4% (for ACDC) of the training data used to train the benchmark.
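
As a rough illustration of the local contrastive idea (not the authors' code), the sketch below treats corresponding spatial regions in the feature maps of two augmented views of the same image as positive pairs, and all other regions in the batch as negatives, using an NT-Xent-style loss; the pooling size and temperature are assumed values.

```python
# A rough sketch of a local NT-Xent-style contrastive loss, not the authors' code.
# Corresponding pooled regions across two views of the same images are positives;
# every other region in the batch is a negative. Pooling size and temperature
# are illustrative assumptions.
import torch
import torch.nn.functional as F

def local_contrastive_loss(feat_a, feat_b, temperature=0.1, pool=4):
    """feat_a, feat_b: [B, C, H, W] features from two augmented views of the same images."""
    # reduce resolution so each pooled cell acts as one "local region"
    za = F.adaptive_avg_pool2d(feat_a, pool).flatten(2).transpose(1, 2)  # [B, P, C]
    zb = F.adaptive_avg_pool2d(feat_b, pool).flatten(2).transpose(1, 2)  # [B, P, C]
    za = F.normalize(za.reshape(-1, za.size(-1)), dim=1)                 # [B*P, C]
    zb = F.normalize(zb.reshape(-1, zb.size(-1)), dim=1)                 # [B*P, C]

    logits = za @ zb.t() / temperature                   # similarity of every region pair
    targets = torch.arange(logits.size(0), device=logits.device)
    # matching regions across the two views sit on the diagonal (the positives)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```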