 Unsupervised or Indirectly Supervised Learning


Reviews: MixMatch: A Holistic Approach to Semi-Supervised Learning

Neural Information Processing Systems

The reviewers are in consensus that this is a well-written paper that combines a number of well-studied SSL methods. The results show good performance across a number of datasets. The recommendation is therefore to accept this paper.


Evaluating multiple models using labeled and unlabeled data

arXiv.org Artificial Intelligence

It remains difficult to evaluate machine learning classifiers in the absence of a large, labeled dataset. While labeled data can be prohibitively expensive or impossible to obtain, unlabeled data is plentiful. Here, we introduce Semi-Supervised Model Evaluation (SSME), a method that uses both labeled and unlabeled data to evaluate machine learning classifiers. SSME is the first evaluation method to take advantage of the fact that: (i) there are frequently multiple classifiers for the same task, (ii) continuous classifier scores are often available for all classes, and (iii) unlabeled data is often far more plentiful than labeled data. The key idea is to use a semi-supervised mixture model to estimate the joint distribution of ground truth labels and classifier predictions. We can then use this model to estimate any metric that is a function of classifier scores and ground truth labels (e.g., accuracy or expected calibration error). We present experiments in four domains where obtaining large labeled datasets is often impractical: (1) healthcare, (2) content moderation, (3) molecular property prediction, and (4) image annotation. Our results demonstrate that SSME estimates performance more accurately than do competing methods, reducing error by 5.1× relative to using labeled data alone and 2.4× relative to the next best competing method. SSME also improves accuracy when evaluating performance across subsets of the test distribution (e.g., specific demographic subgroups) and when evaluating the performance of language models.

Rigorous evaluation is essential to the safe deployment of machine learning classifiers. The standard approach is to measure classifier performance using a large labeled dataset. In practice, however, labeled data is often scarce (Culotta & McCallum, 2005; Dutta & Das, 2023). Exacerbating the challenge of evaluation, the number of off-the-shelf classifiers has increased dramatically through the widespread usage of model hubs. The modern machine learning practitioner thus has a myriad of trained models, but little labeled data with which to evaluate them. In many domains, unlabeled data is much more abundant than labeled data (Bepler et al., 2019; Sagawa et al., 2021; Movva et al., 2024).
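
The key idea stated in the abstract, a semi-supervised mixture model over ground-truth labels and classifier scores, can be illustrated with a minimal sketch. This is not the SSME implementation: it assumes binary labels, models each classifier's scores with an independent Gaussian per class (a density choice made here only for brevity), and fits the mixture with EM while keeping labeled examples fixed to their known class. The function names `fit_posteriors` and `estimate_accuracy`, the synthetic scores, and all parameter values are hypothetical.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_posteriors(labeled, labels, unlabeled, n_iter=50):
    """EM for a two-class mixture over classifier scores (illustrative only).

    labeled / unlabeled: lists of score vectors, one score per classifier.
    labels: 0/1 ground-truth labels for the labeled rows.
    Returns p(y = 1 | scores) for every row, labeled rows first.
    """
    rows = labeled + unlabeled
    n_cls = len(rows[0])
    # Labeled rows keep one-hot posteriors; unlabeled rows start from their
    # mean score, clamped to [0, 1].
    resp = [float(y) for y in labels]
    resp += [min(1.0, max(0.0, sum(r) / n_cls)) for r in unlabeled]
    for _ in range(n_iter):
        # M-step: class prior plus per-classifier mean and variance per class.
        params = []
        for k in (0, 1):
            w = [r if k == 1 else 1.0 - r for r in resp]
            total = sum(w) + 1e-9
            means = [sum(wi * row[m] for wi, row in zip(w, rows)) / total
                     for m in range(n_cls)]
            variances = [sum(wi * (row[m] - means[m]) ** 2
                             for wi, row in zip(w, rows)) / total + 1e-4
                         for m in range(n_cls)]
            params.append((total / len(rows), means, variances))
        # E-step: update posteriors for unlabeled rows only.
        for i in range(len(labeled), len(rows)):
            lik = []
            for prior, means, variances in params:
                p = prior
                for m in range(n_cls):
                    p *= gaussian_pdf(rows[i][m], means[m], variances[m])
                lik.append(p)
            resp[i] = lik[1] / (lik[0] + lik[1] + 1e-300)
    return resp

def estimate_accuracy(rows, resp, classifier_index, threshold=0.5):
    """Expected accuracy of one classifier under the fitted label posteriors."""
    total = 0.0
    for row, r in zip(rows, resp):
        predicted_positive = row[classifier_index] > threshold
        total += r if predicted_positive else 1.0 - r
    return total / len(rows)

# Synthetic usage: two hypothetical classifiers, 20 labeled and 500 unlabeled rows.
random.seed(0)

def sample_scores(y, n_classifiers=2):
    # Scores are centred on 0.75 for class 1 and 0.25 for class 0 (made-up values).
    return [random.gauss(0.75 if y == 1 else 0.25, 0.12) for _ in range(n_classifiers)]

true_labels = [i % 2 for i in range(520)]
rows = [sample_scores(y) for y in true_labels]
labeled, labels, unlabeled = rows[:20], true_labels[:20], rows[20:]

resp = fit_posteriors(labeled, labels, unlabeled)
estimated = estimate_accuracy(rows, resp, classifier_index=0)
actual = sum((row[0] > 0.5) == (y == 1) for row, y in zip(rows, true_labels)) / len(rows)
print(f"estimated accuracy {estimated:.3f}, accuracy on hidden labels {actual:.3f}")
```

Because the fitted model yields a posterior over the label for every example, the same posteriors could in principle be reused for other metrics that depend on scores and labels, which is the property the abstract emphasizes.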


Review for NeurIPS paper: FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence

Neural Information Processing Systems

I quote from the ReMixMatch figure caption: "Augmentation anchoring. We use the prediction for a weakly augmented image (green, middle) as the target for predictions on strong augmentations of the same image". This reads to me as a summary of the presented work, and as such I consider it a special case of ReMixMatch. The authors have discussed the differences between their work and ReMixMatch, mentioning that (1) "ReMixMatch don't use pseudo labeling", and (2) ReMixMatch uses sharpening of pseudo-labels and weight annealing of the unlabeled data loss. However, Section 3.2.1 of ReMixMatch states that the guessed labels are used as targets (for strongly augmented images) with a cross-entropy loss.
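
To make the mechanism under discussion concrete, here is a minimal sketch of an unlabeled-data loss in which the prediction on a weakly augmented view supplies the target for the strongly augmented view of the same image. It assumes hard pseudo-labels and a confidence mask; the function name `anchoring_loss`, the threshold value, and the toy logits are illustrative and are not taken from either paper's code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def anchoring_loss(weak_logits, strong_logits, threshold=0.95):
    """Cross-entropy of strong-view predictions against weak-view pseudo-labels.

    weak_logits / strong_logits: per-example logit lists for the two views.
    Examples whose weak-view confidence is below `threshold` contribute zero,
    and the sum is averaged over the whole batch.
    """
    total = 0.0
    for w, s in zip(weak_logits, strong_logits):
        weak_probs = softmax(w)
        confidence = max(weak_probs)
        if confidence < threshold:
            continue  # masked out: the pseudo-label is not trusted
        pseudo_label = weak_probs.index(confidence)
        total += -math.log(softmax(s)[pseudo_label])
    return total / len(weak_logits)

# Toy batch: the first example is confident on the weak view, the second is not.
weak = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
strong = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
loss = anchoring_loss(weak, strong)
```

Replacing the hard pseudo-label with a sharpened soft distribution, and dropping the confidence mask, would move this sketch toward the "guessed labels as targets" variant that the review attributes to ReMixMatch.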


Reviews: Cross Attention Network for Few-shot Classification

Neural Information Processing Systems

Post rebuttal: I'd like to thank the authors for performing the additional ablation, the comparisons via visualizations, and the experiment in a cluttered environment, as I suggested in my review. I think these additional results would be good additions (to the Appendix at the very least) and would strengthen the paper. I continue to recommend acceptance. I do agree with R3, though, that the proposed transductive method is very similar to previous work on semi-supervised learning, and it would be useful to be clearer about this in the writing.

Before rebuttal. Summary: This paper introduces a state-of-the-art approach to few-shot classification. Two orthogonal components are proposed: the first influences the embedding function applied to the images of an episode, and the second introduces a strategy for using the query set of each episode in a transductive manner, as additional unlabeled data for refining the within-episode classifier.


Reviews: Unsupervised Learning from Noisy Networks with Applications to Hi-C Data

Neural Information Processing Systems

I believe the review of this paper should be done in two stages: 1) method; 2) application. The method, as presented, is fairly general and could be applied to many different scenarios. It is a relatively novel method for network de-noising: it combines multiple networks obtained from noisy observations of the true underlying networks, in particular networks made up of more or less clear clusters. In this context the method is well described. I would be interested to know how well it scales, that is, the complexity and running time of the method on networks of various sizes.


Reviews: Unsupervised Learning for Physical Interaction through Video Prediction

Neural Information Processing Systems

The paper mentions appendices and a link to videos, but the link does not work and the appendices are included in neither the main file nor the supplementary material. I am reviewing the paper as it is. The descriptions of the three methods need to be expanded. The method descriptions rely heavily on Figure 1 and assume that most of the details are clear from the network structure. I did not find this to be the case.


Reviews: A Non-generative Framework and Convex Relaxations for Unsupervised Learning

Neural Information Processing Systems

The introduction claims that this approach to unsupervised learning removes generative assumptions that have been common in the area. I do agree that the unified formulation has many desirable properties, including the notion of excess risk and the lack of assumptions on the data generating process. However, for the PSCA problem, the paper does make a generative assumption, namely the regularly spectral decodable assumption. And for the dictionary learning problem, the paper changes the formulation substantially to allow for group encoding/decoding. So the paper fails to provide strong evidence supporting the unified view of unsupervised learning through CONUS-learnability.


Reviews: Unsupervised Learning of Spoken Language with Visual Context

Neural Information Processing Systems

This is interesting work that points in the right direction, but a few aspects of this paper are a bit problematic: 1) It would have been useful (or interesting) to use a corpus that has existing text captions, and either have users re-speak the text captions, or collect additional captions. The data collection seems generally well thought-out, but why was the Places205 data set used? Prompted speech (such as collected here) is not "spontaneous", otherwise the WSJ recognizer would not have given 20% WER (this aspect is irrelevant for the purpose of this paper, though, I think). Typically, multiple captions are generated for a single image. Has this been done here as well? Or is there only a single caption for each image?


Reviews: Estimating the class prior and posterior from noisy positives and unlabeled data

Neural Information Processing Systems

I note that the first method in particular also does not require the use of calibrated probabilities, but rather a probabilistically consistent ranker (as the estimate is based on a derivative of the RHS of the ROC curve). Overall, I think the paper's contribution is reasonable, being an extension of (Jain et al., 2016) to the noisy PU case, but I think the novelty is a weak point.
Other comments:
- Section 3 is said to include "a few missing results needed for our approach"; explicitly identifying which these are seems prudent.


Reviews: Coupled Generative Adversarial Networks

Neural Information Processing Systems

I am in two minds about the paper. On the one hand, the idea seems very interesting and powerful, as it does not seem to rely on pairs of corresponding training images. The authors make a good effort to demonstrate and analyze the capabilities of the approach in several ways (I especially like the unsupervised domain adaptation results). Visually, the results look promising. On the other hand, it seems to me that the task is not well defined. Since the model is never presented with corresponding image pairs, there is actually nothing in the training data that establishes what "corresponding" means.