 unsupervised data augmentation


Unsupervised Data Augmentation for Consistency Training

Neural Information Processing Systems

Back-translation
Original: Given the low budget and production limitations, this movie is very good.
Back-translated: Since it was highly limited in terms of budget, and the production restrictions, the film was cheerful.
Back-translated: There are few budget items and production limitations to make this film a really good one.
Back-translated: Due to the small dollar amount and production limitations the ouestfilm is very beautiful.
RandAugment


Semi-supervised learning has recently shown much promise in improving deep learning models when labeled data is scarce. Common among recent approaches is the use of consistency training on a large amount of unlabeled data to constrain model predictions to be invariant to input noise. In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically noise produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning. By substituting simple noising operations with advanced data augmentation methods such as RandAugment and back-translation, our method brings substantial improvements across six language and three vision tasks under the same consistency training framework. On the IMDb text classification dataset, with only 20 labeled examples, our method achieves an error rate of 4.20%, outperforming the state-of-the-art model trained on 25,000 labeled examples. On CIFAR-10, a standard semi-supervised learning benchmark, our method outperforms all previous approaches and achieves an error rate of 5.43% with only 250 labeled examples. Our method also combines well with transfer learning, e.g., when fine-tuning from BERT, and yields improvements in the high-data regime, such as ImageNet, both when only 10% of the data is labeled and when the full labeled set is used with 1.3M extra unlabeled examples.
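
The consistency training idea in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes KL divergence as the measure of disagreement between predictions on an unlabeled example and on its augmented copy, and a weight `lam` on that term. The function names (`consistency_loss`, `total_loss`) and the plain-list logits are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))


def consistency_loss(logits_clean, logits_augmented):
    """Mean KL between predictions on clean and augmented unlabeled inputs.

    Assumption: the clean prediction acts as the target distribution.
    """
    losses = [kl_divergence(softmax(zc), softmax(za))
              for zc, za in zip(logits_clean, logits_augmented)]
    return sum(losses) / len(losses)


def cross_entropy(logits_batch, labels):
    """Mean negative log-likelihood of the true labels."""
    losses = [-math.log(softmax(z)[y]) for z, y in zip(logits_batch, labels)]
    return sum(losses) / len(losses)


def total_loss(labeled_logits, labels, logits_clean, logits_augmented, lam=1.0):
    """Supervised loss on labeled data plus weighted consistency loss."""
    return (cross_entropy(labeled_logits, labels)
            + lam * consistency_loss(logits_clean, logits_augmented))
```

When the model predicts the same distribution for an example and its augmented copy, the consistency term is zero, so only disagreement under augmentation is penalized. The quality of the augmentation (for example RandAugment or back-translation) determines which input changes the model is asked to ignore.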


Review for NeurIPS paper: Unsupervised Data Augmentation for Consistency Training

Neural Information Processing Systems

Additional Feedback: My main comment is that the authors do not adequately justify why advanced data augmentation methods work better than simple ones, or when to apply them. The same intuition applies to other semi-supervised methods such as nearest neighbor and label propagation, which assign the same label to all unlabeled examples within the same connected component of a graph. This is intuitive, but it does not explain why the noise from advanced data augmentation methods is better for semi-supervised learning, nor does it provide guarantees for when they work. I acknowledge that I read the rebuttal and thank the authors for their explanations of my questions and concerns.
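
The graph-based behavior the reviewer describes can be illustrated with a small sketch. It is a deliberate simplification of label propagation (a hard majority vote per connected component rather than iterative diffusion of soft labels), and the function name `propagate_labels` and its arguments are hypothetical.

```python
from collections import Counter, defaultdict, deque


def propagate_labels(num_nodes, edges, seed_labels):
    """Give every node the majority seed label of its connected component.

    num_nodes:   number of examples, indexed 0..num_nodes-1
    edges:       iterable of (a, b) pairs linking similar examples
    seed_labels: dict mapping labeled node index -> label
    Nodes in components without any labeled node stay None.
    """
    adjacency = defaultdict(list)
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)

    assigned = [None] * num_nodes
    visited = [False] * num_nodes
    for start in range(num_nodes):
        if visited[start]:
            continue
        # Breadth-first search collects one connected component.
        component = []
        queue = deque([start])
        visited[start] = True
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency[node]:
                if not visited[neighbor]:
                    visited[neighbor] = True
                    queue.append(neighbor)
        # Majority vote over the labeled nodes found in this component.
        votes = Counter(seed_labels[n] for n in component if n in seed_labels)
        if votes:
            winner = votes.most_common(1)[0][0]
            for n in component:
                assigned[n] = winner
    return assigned


result = propagate_labels(
    num_nodes=6,
    edges=[(0, 1), (1, 2), (3, 4)],
    seed_labels={0: "pos", 3: "neg"},
)
```

In this example, nodes 0 to 2 share the label of node 0, nodes 3 and 4 share the label of node 3, and the isolated node 5 stays unlabeled.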


Review -- UDA: Unsupervised Data Augmentation for Consistency Training

#artificialintelligence

This validates the idea that stronger data augmentations found in supervised learning can also lead to larger gains when applied in semi-supervised settings. UDA consistently outperforms the two baselines across different sizes of labeled data. Moreover, the performance gap between UDA and VAT shows the superiority of data-augmentation-based noise. Given the same architecture, UDA outperforms all published results by significant margins and nearly matches the fully supervised performance, which uses 10× more labeled examples. Even with very few labeled examples, UDA offers decent or even competitive performance compared to the SOTA model trained on the fully labeled data.