Unsupervised or Indirectly Supervised Learning
Despite the empirical success and practical significance of (relational) knowledge distillation that matches (the relations of) features between teacher and student models, the corresponding theoretical interpretations remain limited for various knowledge distillation paradigms. In this work, we take an initial step toward a theoretical understanding of relational knowledge distillation (RKD), with a focus on semi-supervised classification problems. We start by casting RKD as spectral clustering on a population-induced graph unveiled by a teacher model. Via a notion of clustering error that quantifies the discrepancy between the predicted and ground truth clusterings, we illustrate that RKD over the population provably leads to low clustering error. Moreover, we provide a sample complexity bound for RKD with limited unlabeled samples. For semi-supervised learning, we further demonstrate the label efficiency of RKD through a general framework of cluster-aware semi-supervised learning that assumes low clustering errors. Finally, by unifying data augmentation consistency regularization into this cluster-aware framework, we show that despite the common effect of learning accurate clusterings, RKD facilitates a "global" perspective through spectral clustering, whereas consistency regularization focuses on a "local" perspective via expansion.
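The "relation matching" that this abstract builds on can be made concrete with a small sketch. The following is a minimal, hypothetical PyTorch example of a distance-based relational distillation loss that aligns normalized pairwise feature distances between teacher and student within a batch; it illustrates the general RKD idea rather than this paper's exact objective, and all names, shapes, and constants are illustrative.

```python
import torch
import torch.nn.functional as F

def pairwise_distances(feats):
    """Pairwise Euclidean distances within a batch, normalized by their mean."""
    d = torch.cdist(feats, feats, p=2)        # (B, B) distance matrix
    mean = d[d > 0].mean()                    # mean over off-diagonal entries
    return d / (mean + 1e-8)

def relational_kd_loss(student_feats, teacher_feats):
    """Match the student's pairwise feature relations to the teacher's."""
    with torch.no_grad():
        target = pairwise_distances(teacher_feats)   # teacher relations act as targets
    pred = pairwise_distances(student_feats)
    return F.smooth_l1_loss(pred, target)

# Toy usage: teacher and student feature dimensions may differ; only relations are matched.
teacher_feats = torch.randn(8, 128)
student_feats = torch.randn(8, 64, requires_grad=True)
loss = relational_kd_loss(student_feats, teacher_feats)
loss.backward()
```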
Universal Semi-Supervised Learning
Zhuo Huang
Universal Semi-Supervised Learning (UniSSL) aims to solve the open-set problem where both the class distribution (i.e., class set) and the feature distribution (i.e., feature domain) differ between the labeled and unlabeled datasets. This problem seriously hinders the practical deployment of classical SSL. Unlike existing SSL methods that target the open-set problem but study only a single scenario of class distribution mismatch and ignore feature distribution mismatch, we consider a more general case where a mismatch exists in both the class and feature distributions. For this setting, we propose a "Class-shAring data detection and Feature Adaptation" (CAFA) framework which requires no prior knowledge of the class relationship between the labeled and unlabeled datasets. In particular, CAFA utilizes a novel scoring strategy to detect the data in the shared class set. It then conducts domain adaptation to fully exploit the value of the detected class-sharing data for better semi-supervised consistency training. Extensive experiments on several benchmark datasets show the effectiveness of our method in tackling open-set problems.
OpenMatch: Open-set Consistency Regularization for Semi-supervised Learning with Outliers
Donghyun Kim
Semi-supervised learning (SSL) is an effective means to leverage unlabeled data to improve a model's performance. Typical SSL methods like FixMatch assume that labeled and unlabeled data share the same label space. However, in practice, unlabeled data can contain categories unseen in the labeled set, i.e., outliers, which can significantly harm the performance of SSL algorithms. To address this problem, we propose a novel Open-set Semi-Supervised Learning (OSSL) approach called OpenMatch. Learning representations of inliers while rejecting outliers is essential for the success of OSSL. To this end, OpenMatch unifies FixMatch with novelty detection based on one-vs-all (OVA) classifiers. The OVA classifier outputs the confidence score of a sample being an inlier, providing a threshold to detect outliers. Another key contribution is an open-set soft-consistency regularization loss, which enhances the smoothness of the OVA classifier with respect to input transformations and greatly improves outlier detection. OpenMatch achieves state-of-the-art performance on three datasets, and even outperforms a fully supervised model in detecting outliers unseen in unlabeled data on CIFAR-10.
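As a rough illustration of the one-vs-all scoring described above, here is a hypothetical sketch (not OpenMatch's actual implementation): each known class has a two-way head whose softmax yields an inlier probability, and a sample is flagged as an outlier when the inlier probability of its predicted class falls below a threshold. Shapes, the threshold value, and function names are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def ova_outlier_scores(closed_set_logits, ova_logits, threshold=0.5):
    """Score samples with per-class one-vs-all heads.

    closed_set_logits: (B, K) standard K-way classifier logits.
    ova_logits:        (B, K, 2) per-class OVA logits; channel 0 = inlier, 1 = outlier.
    Returns the predicted class, its inlier probability, and an outlier mask.
    """
    pred = closed_set_logits.argmax(dim=1)                        # (B,) predicted known class
    ova_prob = F.softmax(ova_logits, dim=2)                       # (B, K, 2)
    inlier_prob = ova_prob[torch.arange(pred.size(0)), pred, 0]   # inlier prob. of predicted class
    return pred, inlier_prob, inlier_prob < threshold

# Toy usage: K = 6 known classes, a batch of 4 unlabeled samples.
closed_logits = torch.randn(4, 6)
ova_logits = torch.randn(4, 6, 2)
pred, score, is_outlier = ova_outlier_scores(closed_logits, ova_logits)
```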
Dense Unsupervised Learning for Video Segmentation
Nikita Araslanov, Simone Schaub-Meyer, Stefan Roth, Department of Computer Science, TU Darmstadt
We present a novel approach to unsupervised learning for video object segmentation (VOS). Unlike previous work, our formulation allows learning dense feature representations directly in a fully convolutional regime. We rely on uniform grid sampling to extract a set of anchors and train our model to disambiguate between them at both inter- and intra-video levels. However, a naive scheme to train such a model results in a degenerate solution. We propose to prevent this with a simple regularisation scheme, accommodating the equivariance property of the segmentation task to similarity transformations. Our training objective admits an efficient implementation and exhibits fast training convergence. On established VOS benchmarks, our approach exceeds the segmentation accuracy of previous work despite using significantly less training data and compute power.
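The uniform grid sampling step mentioned above can be sketched in a few lines; this is a hypothetical illustration (the stride, shapes, and similarity comparison are assumptions, not the paper's settings): anchor features are read off a dense convolutional feature map at regular grid positions and can then be compared across frames or videos.

```python
import torch

def sample_grid_anchors(feature_map, stride=8):
    """Pick anchor features at uniform grid locations of a dense feature map.

    feature_map: (B, C, H, W) output of a fully convolutional encoder.
    Returns:     (B, N, C) anchor features, where N is the number of grid points.
    """
    B, C, H, W = feature_map.shape
    ys = torch.arange(stride // 2, H, stride)
    xs = torch.arange(stride // 2, W, stride)
    grid = feature_map[:, :, ys][:, :, :, xs]      # (B, C, len(ys), len(xs))
    return grid.flatten(2).transpose(1, 2)         # (B, N, C)

# Toy usage: anchors from two frames compared via a similarity (affinity) matrix.
frame_a = torch.randn(1, 64, 32, 32)
frame_b = torch.randn(1, 64, 32, 32)
a, b = sample_grid_anchors(frame_a), sample_grid_anchors(frame_b)
affinity = torch.einsum('bnc,bmc->bnm', a, b)      # (1, N, N) anchor-to-anchor similarities
```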
Unsupervised Representation Transfer for Small Networks: I Believe I Can Distill On-the-Fly
Recent remarkable improvements in unsupervised visual representation learning have been based on heavy networks with large-batch training. While recent methods have greatly reduced the gap between supervised and unsupervised performance of deep models such as ResNet-50, this progress has been relatively limited for small models. In this work, we propose a novel unsupervised learning framework for small networks that combines deep self-supervised representation learning and knowledge distillation within one-phase training. In particular, a teacher model is trained to produce consistent cluster assignments between different views of the same image. Simultaneously, a student model is encouraged to mimic the predictions of the on-the-fly self-supervised teacher. For effective knowledge transfer, we adopt the idea of a domain classifier so that student training is guided by discriminative features invariant to the representational space shift between teacher and student. We also introduce a network-driven multi-view generation paradigm to capture the rich feature information contained in the network itself. Extensive experiments show that our student models surpass state-of-the-art offline distilled networks, even those distilled from stronger self-supervised teachers, as well as top-performing self-supervised models. Notably, our ResNet-18, trained with a ResNet-50 teacher, achieves 68.3% ImageNet top-1 accuracy on frozen-feature linear evaluation, which is only 1.5% below the supervised baseline.
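The one-phase teacher-student coupling can be illustrated with a small, hypothetical sketch: the teacher's soft cluster assignment for an image serves as the target for the student's prediction on another view, via a cross-entropy to soft targets. The domain-classifier guidance and network-driven multi-view generation are omitted here, and the temperature and cluster count are illustrative rather than the paper's values.

```python
import torch
import torch.nn.functional as F

def on_the_fly_distill_loss(teacher_logits, student_logits, temperature=0.1):
    """Student mimics the on-the-fly teacher's soft cluster assignments."""
    with torch.no_grad():
        targets = F.softmax(teacher_logits / temperature, dim=1)     # soft cluster assignment
    log_probs = F.log_softmax(student_logits / temperature, dim=1)
    return -(targets * log_probs).sum(dim=1).mean()                  # cross-entropy to soft targets

# Toy usage: 3000 prototypes; teacher and student see different augmented views of the same image.
teacher_logits = torch.randn(16, 3000)
student_logits = torch.randn(16, 3000, requires_grad=True)
loss = on_the_fly_distill_loss(teacher_logits, student_logits)
loss.backward()
```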
In Section A we provide the deferred proofs from Section 3. In Section B we provide the deferred proofs from Section 4. In Section C we show how to extend our main lower bound to apply to learning in Wasserstein distance. In Section D we fill in the details alluded to in the introduction for how supervised learning lower bounds can imply unsupervised learning lower bounds.
A Deferred Proofs From Section 3
We begin by reviewing standard concepts pertaining to establishing statistical query lower bounds for unsupervised learning problems, as developed in [FGR…]. Note that when p = q, this is simply the chi-squared divergence between p and r. Let Z be a distributional search problem over distributions D and solutions F. For γ, β, if N = SD(Z, γ, β), then any statistical query algorithm for Z requires at least Nγ/(β − γ) queries to STAT(√(2γ)) or VSTAT(1/(6γ)). We will use the following two lemmas from [DKS17] and [DK20].
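For readability, the quantity referred to in the remark on p = q above is presumably the standard pairwise correlation used in statistical query lower bounds; it is stated here in the form commonly used in that literature, as an assumption (the paper's exact normalization may differ):

```latex
\[
  \chi_{r}(p, q) \;:=\; \int \frac{p(x)\,q(x)}{r(x)}\,dx \;-\; 1,
  \qquad \text{so that} \qquad
  \chi_{r}(p, p) \;=\; \chi^{2}(p \,\|\, r).
\]
```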
Semi-supervised Learning with Deep Generative Models
Durk P. Kingma, Shakir Mohamed, Danilo Jimenez Rezende, Max Welling
The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis. We revisit the approach to semi-supervised learning with generative models and develop new models that allow for effective generalisation from small labelled data sets to large unlabelled ones. Generative approaches have thus far been either inflexible, inefficient or non-scalable. We show that deep generative models and approximate Bayesian inference exploiting recent advances in variational methods can be used to provide significant improvements, making generative approaches highly competitive for semi-supervised learning.
Review for NeurIPS paper: Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning
The paper focuses on efficiently exploring MDPs with high-dimensional state representations by combining an unsupervised algorithm for learning a low-dimensional representation with solving the problem in this low-dimensional space. The paper is largely theoretical and shows that, under certain conditions, near-optimal policies can be found with polynomial complexity in the number of latent states. The reviewers mostly agreed on the following points. The paper is considered well-written and presents theoretically strong results that are sound, novel, and non-trivial. As weaknesses of the paper, the reviewers mentioned the lack of empirical results in more realistic settings and the restrictive assumptions.