Unsupervised or Indirectly Supervised Learning
Return of Unconditional Generation: A Self-supervised Representation Generation Method
Unconditional generation--the problem of modeling data distribution without relying on human-annotated labels--is a long-standing and fundamental challenge in generative models, creating a potential of learning from large-scale unlabeled data. In the literature, the generation quality of an unconditional method has been much worse than that of its conditional counterpart. This gap can be attributed to the lack of semantic information provided by labels. In this work, we show that one can close this gap by generating semantic representations in the representation space produced by a self-supervised encoder. These representations can be used to condition the image generator.
Efficient Generalization via Multimodal Co-Training under Data Scarcity and Distribution Shift
Pan, Tianyu Bell, Woodard, Damon L.
This paper explores a multimodal co-training framework designed to enhance model generalization in situations where labeled data is limited and distribution shifts occur. We thoroughly examine the theoretical foundations of this framework, deriving conditions under which the use of unlabeled data and the promotion of agreement between classifiers for different modalities lead to significant improvements in generalization. We also present a convergence analysis that confirms the effectiveness of iterative co-training in reducing classification errors. In addition, we establish a novel generalization bound that, for the first time in a multimodal co-training context, decomposes and quantifies the distinct advantages gained from leveraging unlabeled multimodal data, promoting inter-view agreement, and maintaining conditional view independence. Our findings highlight the practical benefits of multimodal co-training as a structured approach to developing data-efficient and robust AI systems that can effectively generalize in dynamic, real-world environments. The theoretical foundations are examined in dialogue with, and in advance of, established co-training principles.
CAST: Contrastive Adaptation and Distillation for Semi-Supervised Instance Segmentation
Taghavi, Pardis, Liu, Tian, Li, Renjie, Langari, Reza, Tu, Zhengzhong
Instance segmentation demands costly per-pixel annotations and computationally expensive models. We introduce CAST, a semi-supervised knowledge distillation (SSKD) framework that compresses pre-trained vision foundation models (VFM) into compact experts using limited labeled and abundant unlabeled data. CAST unfolds in three stages: (1) domain adaptation of the VFM(s) via self-training with contrastive calibration, (2) knowledge transfer through a unified multi-objective loss, and (3) student refinement to mitigate residual pseudo-label bias. Central to CAST is an \emph{instance-aware pixel-wise contrastive loss} that fuses mask and class scores to extract informative negatives and enforce clear inter-instance margins. By maintaining this contrastive signal across both adaptation and distillation, we align teacher and student embeddings and fully leverage unlabeled images. On Cityscapes and ADE20K, our ~11x smaller student improves over its zero-shot VFM teacher(s) by +8.5 and +7.1 AP, surpasses adapted teacher(s) by +3.4 and +1.5 AP, and further outperforms state-of-the-art SSKD methods on both benchmarks.
Graph Stochastic Neural Networks for Semi-supervised Learning: Supplemental Material Haibo Wang
The supplemental material includes the following contents. Actually, Eq. (11) can also be derived from Eq. (4) rigorously. The pseudo-code of GSNN is in Algorithm 1. The detailed statistics of three datasets used in this paper are listed in Table 1. In this paper, when evaluating the performance in the standard experimental scenario and in the label-scarce scenario, we compare with six state-of-the-art baselines used for graph-based semi-supervised learning.
Learning with Fredholm Kernels
Qichao Que, Mikhail Belkin, Yusu Wang
In this paper we propose a framework for supervised and semi-supervised learning based on reformulating the learning problem as a regularized Fredholm integral equation. Our approach fits naturally into the kernel framework and can be interpreted as constructing new data-dependent kernels, which we call Fredholm kernels. We proceed to discuss the "noise assumption" for semi-supervised learning and provide both theoretical and experimental evidence that Fredholm kernels can effectively utilize unlabeled data under the noise assumption. We demonstrate that methods based on Fredholm learning show very competitive performance in the standard semi-supervised learning setting.