Unsupervised or Indirectly Supervised Learning
Reviews: Temporal Coherency based Criteria for Predicting Video Frames using Deep Multi-stage Generative Adversarial Networks
This method makes two contributions to next-frame prediction from video sequences. The first is a normalized cross-correlation loss, which provides a better similarity score for judging whether the predicted frame is close to the true future frame. The second is a pairwise contrastive divergence loss, based on the similarity of image features. Results are presented on the UCF101 and KITTI datasets, with a numerical comparison against Mathieu et al. (ICLR 2016) using image similarity metrics (PSNR, SSIM). Comments: The newly proposed losses are interesting, but I suspect a problem in the evaluation.
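To make the first loss concrete, here is a minimal sketch of a normalized cross-correlation score between a predicted and a true frame. It assumes a single global NCC over flattened grayscale frames; the function names and the `1 - NCC` loss form are illustrative assumptions, not the paper's exact (possibly patch-wise) formulation.

```python
import math


def normalized_cross_correlation(pred, target, eps=1e-8):
    """Global NCC between two equally sized frames given as flat lists of floats.

    Assumption: one global score over the whole frame, not a patch-wise variant.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("frames must be non-empty and of equal length")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    # Covariance term and the two centered norms.
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    norm_t = math.sqrt(sum((t - mean_t) ** 2 for t in target))
    # eps guards against division by zero for constant frames.
    return cov / (norm_p * norm_t + eps)


def ncc_loss(pred, target):
    """Hypothetical loss form: 1 - NCC, close to 0 for perfectly correlated frames."""
    return 1.0 - normalized_cross_correlation(pred, target)
```

Because the score is computed on mean-centered, norm-scaled intensities, it is insensitive to a global brightness or contrast change, which is one plausible reason it could judge similarity better than a plain pixel-wise distance.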
Reviews: ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching
Adversarial Feature Learning) is an interesting extension to GANs. It trains a generative model by learning a generator G(z) and an inference function E(x), where G(z) maps samples from a latent space to data and E(x) maps observed data to the latent space. E(x) and G(z) are trained adversarially and jointly against a discriminator D(x,z), which learns to distinguish real (x, E(x)) pairs from fake (G(z), z) pairs. This is an interesting approach and has been shown to produce latent representations that are useful for semi-supervised learning. The authors highlight an issue with the ALI model by constructing a small example for which some optimal solutions of the ALI loss function have poor reconstruction, i.e. G(E(x)) can be very different from x.
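The reconstruction issue can be illustrated with a toy discrete case. The sketch below is not the authors' example; it assumes a binary x and a binary z with uniform marginals and stochastic encoder and generator maps, and all names are hypothetical. Two different solutions match the joint distributions exactly, yet only one of them reconstructs x reliably.

```python
def joint_from_encoder(p_x, enc):
    """q(x, z) = q(x) * q(z | x), with enc[x][z] = q(z | x)."""
    return [[p_x[x] * enc[x][z] for z in range(len(enc[x]))]
            for x in range(len(p_x))]


def joint_from_generator(p_z, gen):
    """p(x, z) = p(z) * p(x | z), with gen[z][x] = p(x | z)."""
    n_x = len(gen[0])
    return [[p_z[z] * gen[z][x] for z in range(len(p_z))]
            for x in range(n_x)]


def reconstruction_prob(p_x, enc, gen):
    """Probability that encoding x and then decoding returns the same x."""
    return sum(p_x[x] * enc[x][z] * gen[z][x]
               for x in range(len(p_x))
               for z in range(len(enc[x])))


uniform = [0.5, 0.5]
identity_map = [[1.0, 0.0], [0.0, 1.0]]   # z copies x, x copies z
random_map = [[0.5, 0.5], [0.5, 0.5]]     # z ignores x, x ignores z

for name, mapping in [("identity", identity_map), ("random", random_map)]:
    matched = joint_from_encoder(uniform, mapping) == joint_from_generator(uniform, mapping)
    print(name, "joints match:", matched,
          "reconstruction prob:", reconstruction_prob(uniform, mapping, mapping))
```

Both settings make the encoder joint equal to the generator joint, so a discriminator on (x, z) pairs cannot tell them apart, but the second recovers x only half of the time.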
Reviews: Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language
After rebuttal comments: * readability: I trust the authors to update the paper based on my suggestions (as they agreed to in their rebuttal). For AttrGAN, they did change the weight sweep and for SISGAN they used the same hyperparameters as they used in their method (which I would object to in general, but given that the authors took most of their hyperparameters from DCGAN, does not create an unfair advantage). I expect the additional details of the experimental results to be added in the paper (as supplementary material). Ensure that content that is not relevant to the text does not change. Method: to avoid changing too much of the image, use local discriminators that learn the presence of individual visual attributes.
Reviews: Realistic Evaluation of Deep Semi-Supervised Learning Algorithms
This paper proposes a systematic evaluation of SSL methods, studies the pitfalls of current approaches to evaluation, and conducts experiments to show the impact of rigorous validation on the kinds of conclusions we can draw from these methods. I really like the paper and read it when it appeared on arXiv back in April. In many places we are lacking these kinds of systematic approaches to robust evaluation, and it's refreshing to see more papers emerging that question the foundation of our validation methodologies and provide a coherent evaluation. Suggestions for improvements: - The paper mainly deals with two image categorisation datasets. While these methods have been studied in many recent SSL papers, they also have their own limitations, some of which are mentioned in the paper. But the main problem is that it restricts them to a single domain, namely image categorisation.
Reviews: Stabilizing Training of Generative Adversarial Networks through Regularization
This paper proposes to stabilize GAN training with a gradient-norm regularizer. The regularizer is designed for the conventional GAN or, more generally, the f-GAN proposed last year. The idea is interesting, but the justification is a little coarse. However, the regularizer defined in (21) is for arbitrary \psi, which contradicts this assumption. The authors claim that 'the results clearly demonstrate the robustness of the regularizer w.r.t. the various regularization bandwidths'.
Reviews: The Pessimistic Limits and Possibilities of Margin-based Losses in Semi-supervised Learning
Overview and Recommendation: Many popular binary classifiers, such as SVMs and logistic regression, are defined by convex margin-based surrogate losses. Designing a semi-supervised learning algorithm for these classifiers that is guaranteed to improve upon the "lazy" approach of throwing away the unlabeled data and training on the labeled data alone is of considerable interest, because of the time-consuming experimentation that the use of SSL currently requires. This paper analyzes this problem, and the results it presents are primarily of theoretical interest. I had great difficulty in rating the significance of this work, so my own confidence rating is only 3. The proofs of the theorems use elementary steps. I checked them in detail and they are correct, but the significance of the theorems themselves was hard to measure.
Reviews: Good Semi-supervised Learning That Requires a Bad GAN
After reading the rebuttal I changed my score to 7. Overall it is an interesting paper with an interesting idea. Although the theoretical contributions are emphasized, I find the empirical findings more appealing. The theory presented in the paper is not convincing (input versus feature, convexity, etc.). I think the link to classical semi-supervised learning and the cluster assumption should be emphasized, along with the *low density assumption on the boundary*, as explained in this paper: Semi-Supervised Classification by Low Density Separation, Olivier Chapelle and Alexander Zien, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.76.5826&rep=rep1&type=pdf. I am changing my review to 7, and I hope that the authors will put their contribution in the context of known work in semi-supervised learning, namely that the boundary of separation should lie in the low density regions. This will put the paper better in context.
Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition
Zhou, Zi-Hao, Fang, Siyuan, Zhou, Zi-Jing, Wei, Tong, Wan, Yuanyu, Zhang, Min-Ling
Long-tailed semi-supervised learning poses a significant challenge in training models with limited labeled data exhibiting a long-tailed label distribution. Current state-of-the-art LTSSL approaches heavily rely on high-quality pseudo-labels for large-scale unlabeled data. However, these methods often neglect the impact of representations learned by the neural network and struggle with real-world unlabeled data, which typically follows a different distribution than labeled data. This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning. Our framework derives the class-balanced contrastive loss through Gaussian kernel density estimation. We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels. By progressively estimating the underlying label distribution and optimizing its alignment with model predictions, we tackle the diverse distribution of unlabeled data in real-world scenarios. Extensive experiments across multiple datasets with varying unlabeled data distributions demonstrate that CCL consistently outperforms prior state-of-the-art methods, achieving over 4% improvement on the ImageNet-127 dataset. Our source code is available at https://github.com/zhouzihao11/CCL
Reviews: Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results
The paper proposes a new method for using unlabeled data in semi-supervised learning. The idea is to construct a teacher network from the student network during training by taking an exponentially decaying moving average of the student's weights, updated after each batch. This is inspired by previous work that uses a temporal ensemble of the softmax outputs, and aims to reduce the variance of the targets during training. Noise of various forms is added to both labelled and unlabelled examples, and an L2 penalty is added to encourage the student's outputs to be consistent with the teacher's. As the authors mention, this acts as a kind of soft adaptive label propagation mechanism. The advantage of their approach over temporal ensembling is that it can be used in the online setting.
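The two ingredients described above, the moving-average teacher and the L2 consistency penalty, can be sketched as follows. This is a minimal illustration under stated assumptions: weights are held in a plain dict of float lists, the decay value is arbitrary, and the penalty is taken between softmax outputs; the function names are hypothetical and not from the paper's code.

```python
import math


def ema_update(teacher, student, decay=0.99):
    """In-place update after each batch: teacher <- decay * teacher + (1 - decay) * student.

    Both arguments map parameter names to lists of floats (an assumed layout).
    """
    for name, student_vals in student.items():
        teacher_vals = teacher[name]
        for i, s in enumerate(student_vals):
            teacher_vals[i] = decay * teacher_vals[i] + (1.0 - decay) * s


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(student_logits, teacher_logits):
    """Mean squared difference between student and teacher softmax outputs."""
    ps = softmax(student_logits)
    pt = softmax(teacher_logits)
    return sum((a - b) ** 2 for a, b in zip(ps, pt)) / len(ps)
```

Since the update needs only the current student weights, the teacher is available after every batch, which is consistent with the point about the online setting.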
Reviews: Co-teaching: Robust training of deep neural networks with extremely noisy labels
This paper proposes and empirically investigates a co-teaching learning strategy that uses two networks to cross-train on data with noisy labels for robust learning. Experiments on MNIST, CIFAR10 and CIFAR100 seem to be supportive. The paper is well structured and well written, but technically I have the following concerns. Since the proposed co-teaching strategy is motivated by co-training, one can tell the similarity between the two. Typically, in a semi-supervised learning setup by co-training, the labeled and unlabeled data are known.
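For readers unfamiliar with the cross-training idea, a rough sketch of one exchange step is given below. The review text does not spell out the selection rule; the sketch assumes the small-loss criterion usually associated with co-teaching, where each network keeps the samples it finds easiest and hands them to its peer. The function names and the `keep_ratio` parameter are hypothetical.

```python
def small_loss_indices(losses, keep_ratio):
    """Indices of the lowest-loss samples in a batch (assumed selection rule)."""
    n_keep = max(1, int(len(losses) * keep_ratio))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i])
    return ranked[:n_keep]


def co_teaching_exchange(losses_a, losses_b, keep_ratio):
    """Return (indices network A trains on, indices network B trains on).

    Each network is updated on the samples its peer selected, so that one
    network's label-noise mistakes are not fed straight back into itself.
    """
    picked_by_a = small_loss_indices(losses_a, keep_ratio)
    picked_by_b = small_loss_indices(losses_b, keep_ratio)
    return sorted(picked_by_b), sorted(picked_by_a)


# Per-sample losses of the two networks on the same batch of four examples.
train_a, train_b = co_teaching_exchange(
    [0.1, 2.0, 0.3, 1.5], [1.9, 0.2, 0.4, 0.1], keep_ratio=0.5
)
print(train_a, train_b)
```

Unlike co-training, this step needs no second view of the data and no labeled/unlabeled split; the only signal is each network's own per-sample loss.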