 Unsupervised or Indirectly Supervised Learning



Capacity-Net-Based RIS Precoding Design without Channel Estimation for mmWave MIMO System

arXiv.org Artificial Intelligence

In this paper, we propose Capacity-Net, a novel unsupervised learning approach aimed at maximizing the achievable rate in reflecting intelligent surface (RIS)-aided millimeter-wave (mmWave) multiple-input multiple-output (MIMO) systems. To combat the severe channel fading of the mmWave spectrum, we optimize the phase-shifting factors of the reflective elements in the RIS to enhance the achievable rate. However, most optimization algorithms rely heavily on complete and accurate channel state information (CSI), which is often challenging to acquire since the RIS is mostly composed of passive components. To circumvent this challenge, we leverage unsupervised learning techniques with implicit CSI provided by the received pilot signals. Specifically, evaluating the achievable rate, which serves as the performance metric for the current optimization result of the unsupervised learning method, usually requires perfect CSI. Instead of performing channel estimation, we propose Capacity-Net to establish a mapping among the received pilot signals, the optimized RIS phase shifts, and the resultant achievable rates. Simulation results demonstrate the superiority of the proposed Capacity-Net-based unsupervised learning approach over learning methods based on traditional channel estimation.
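To make the role of the phase shifts concrete, the following is a minimal sketch of the CSI-based rate that the abstract describes as hard to evaluate in practice. It is not the paper's method: it assumes a simplified single-antenna link instead of MIMO, perfectly known channel gains, and hypothetical names (achievable_rate, aligned_phases, h_direct, h_ris, g_ris).

```python
import cmath
import math


def achievable_rate(h_direct, h_ris, g_ris, phases, snr=1.0):
    """Rate in bit/s/Hz of a single-antenna link assisted by an RIS.

    h_direct:  complex gain of the direct transmitter-receiver path
    h_ris[n]:  complex gain from RIS element n to the receiver
    g_ris[n]:  complex gain from the transmitter to RIS element n
    phases[n]: phase shift in radians applied by RIS element n
    All gains are assumed to be perfectly known (an illustrative assumption).
    """
    effective = h_direct + sum(
        h * cmath.exp(1j * theta) * g
        for h, g, theta in zip(h_ris, g_ris, phases)
    )
    return math.log2(1.0 + snr * abs(effective) ** 2)


def aligned_phases(h_direct, h_ris, g_ris):
    """Phases that rotate every reflected path onto the direct path."""
    target = cmath.phase(h_direct)
    return [target - cmath.phase(h * g) for h, g in zip(h_ris, g_ris)]
```

With aligned phases, the reflected paths add coherently to the direct path, so the rate is never lower than with all phases set to zero. Both functions need the channel gains as inputs, which is exactly the dependence that a learned mapping from pilot signals and phase shifts to rates is meant to avoid.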


U-Mamba2-SSL for Semi-Supervised Tooth and Pulp Segmentation in CBCT

arXiv.org Artificial Intelligence

Accurate segmentation of teeth and pulp in Cone-Beam Computed Tomography (CBCT) is vital for clinical applications like treatment planning and diagnosis. However, this process requires extensive expertise and is exceptionally time-consuming, highlighting the critical need for automated algorithms that can effectively utilize unlabeled data. In this paper, we propose U-Mamba2-SSL, a novel semi-supervised learning framework that builds on the U-Mamba2 model and employs a multi-stage training strategy. The framework first pre-trains U-Mamba2 in a self-supervised manner using a disruptive autoencoder. It then leverages unlabeled data through consistency regularization, where we introduce input and feature perturbations to ensure stable model outputs. Finally, a pseudo-labeling strategy is implemented with a reduced loss weighting to minimize the impact of potential errors. U-Mamba2-SSL achieved an average score of 0.789 and a DSC of 0.917 on the hidden test set, placing first in Task 1 of the STSR 2025 challenge. The code is available at https://github.com/zhiqin1998/UMamba2.
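The DSC reported above is the Dice similarity coefficient, twice the overlap between two masks divided by their total size. A minimal sketch follows, assuming flattened binary masks and a hypothetical function name; the challenge's exact evaluation protocol is not given in the abstract.

```python
def dice_score(pred, target):
    """Dice similarity coefficient between two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

For example, a prediction with two foreground voxels, one of which overlaps a single-voxel ground truth, scores 2 * 1 / (2 + 1), roughly 0.667.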


Simple yet Effective Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization

arXiv.org Artificial Intelligence

Semi-supervised learning (SSL) has emerged as a practical solution for addressing data scarcity challenges by leveraging unlabeled data. Recently, vision-language models (VLMs), pre-trained on massive image-text pairs, have demonstrated remarkable zero-/few-shot performance that often surpasses SSL approaches due to their exceptional generalization capabilities. This gap motivates us to ask: how can we effectively transfer the powerful generalization capabilities of VLMs into task-specific models? Knowledge distillation (KD) offers a natural framework for transferring VLM capabilities, but we identify that it suffers from gradient conflicts between supervised and distillation losses. To address this challenge, we propose Dual-Head Optimization (DHO), which introduces dual prediction heads for each distinct signal. We observe that DHO resolves gradient conflicts, enabling improved feature learning compared to single-head KD baselines, with practical benefits of minimal computational overhead and test-time hyperparameter tuning without retraining. Extensive experiments across 15 datasets show that DHO consistently outperforms KD baselines, often outperforming teacher models with smaller student models. DHO also achieves new state-of-the-art performance on both in-distribution ImageNet semi-supervised learning and out-of-distribution generalization across ImageNet variants. We publicly release our code and model checkpoints to facilitate future research at https://github.com/erjui/DHO.
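One way two prediction heads could allow test-time tuning without retraining is to blend their outputs with a mixing weight chosen after training. The sketch below assumes a linear interpolation of softmax probabilities; the abstract does not state the actual combination rule, and dual_head_predict and alpha are hypothetical names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dual_head_predict(sup_logits, kd_logits, alpha=0.5):
    """Blend the class probabilities of the supervised and distillation heads.

    alpha weights the supervised head and (1 - alpha) the distillation head.
    Linear interpolation is an assumption made for this illustration.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    p_sup = softmax(sup_logits)
    p_kd = softmax(kd_logits)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p_sup, p_kd)]
```

Because alpha only affects inference, it can be swept on a validation set while both heads stay fixed.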


Optimistic Concurrency Control for Distributed Unsupervised Learning

Neural Information Processing Systems

Research on distributed machine learning algorithms has focused primarily on one of two extremes: algorithms that obey strict concurrency constraints or algorithms that obey few or no such constraints. We consider an intermediate alternative in which algorithms optimistically assume that conflicts are unlikely and, if conflicts do arise, a conflict-resolution protocol is invoked. We view this "optimistic concurrency control" paradigm as well suited to large-scale machine learning algorithms, particularly in the unsupervised setting. We demonstrate our approach in three problem areas: clustering, feature learning and online facility location. We evaluate our methods via large-scale experiments in a cluster computing environment.
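The optimistic pattern described above can be illustrated with a single pass of threshold-based cluster creation. This is a simplified sketch and not the paper's algorithm: workers are simulated serially, the distance threshold rule is an assumption, and all function names are hypothetical.

```python
import math


def nearest_distance(point, centers):
    """Distance from point to its closest center (inf if there are none)."""
    return min((math.dist(point, c) for c in centers), default=math.inf)


def worker_propose(batch, snapshot, threshold):
    """Optimistic step: against a possibly stale snapshot of the centers,
    return the points that look like they need a new cluster."""
    return [p for p in batch if nearest_distance(p, snapshot) > threshold]


def validate(proposals, centers, threshold):
    """Serial conflict resolution: accept a proposal only if no center,
    including those accepted earlier in this pass, already covers it."""
    for p in proposals:
        if nearest_distance(p, centers) > threshold:
            centers.append(p)
    return centers


def occ_cluster_centers(points, threshold, n_workers=2):
    """One optimistic pass: propose in parallel (simulated), then validate."""
    centers = []
    snapshot = list(centers)  # every worker sees the same stale view
    batches = [points[i::n_workers] for i in range(n_workers)]
    proposals = []
    for batch in batches:
        proposals.extend(worker_propose(batch, snapshot, threshold))
    return validate(proposals, centers, threshold)
```

Workers never block each other; only the proposals go through the serial validator, which rejects the ones that conflict with a center accepted earlier in the same pass.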


Correlated random features for fast semi-supervised learning

Neural Information Processing Systems

This paper presents Correlated Nystrom Views (XNV), a fast semi-supervised algorithm for regression and classification. The algorithm draws on two main ideas. First, it generates two views consisting of computationally inexpensive random features. It has been shown that CCA regression can substantially reduce variance with a minimal increase in bias if the views contain accurate estimators. Recent theoretical and empirical work shows that regression with random features closely approximates kernel regression, implying that the accuracy requirement holds for random views.


On a Theory of Nonparametric Pairwise Similarity for Clustering: Connecting Clustering to Classification

Neural Information Processing Systems

The success of pairwise clustering largely depends on the pairwise similarity function defined over the data points, where kernel similarity is broadly used. In this paper, we present a novel pairwise clustering framework by bridging the gap between clustering and multi-class classification. This pairwise clustering framework learns an unsupervised nonparametric classifier from each data partition, and searches for the optimal partition of the data by minimizing the generalization error of the learned classifiers associated with the data partitions. We consider two nonparametric classifiers in this framework, i.e., the nearest neighbor classifier and the plug-in classifier. When the underlying data distribution is modeled by nonparametric kernel density estimation, the generalization error bounds for both unsupervised nonparametric classifiers are sums of nonparametric pairwise similarity terms between the data points for the purpose of clustering. Under a uniform distribution, the nonparametric similarity terms induced by both unsupervised classifiers exhibit a well-known form of kernel similarity. We also prove that the generalization error bound for the unsupervised plug-in classifier is asymptotically equal to the weighted volume of cluster boundary for Low Density Separation, a widely used criterion for semi-supervised learning and clustering. Based on the derived nonparametric pairwise similarity using the plug-in classifier, we propose a new nonparametric exemplar-based clustering method with enhanced discriminative capability, whose superiority is evidenced by the experimental results.


Learning with Fredholm Kernels

Neural Information Processing Systems

In this paper we propose a framework for supervised and semi-supervised learning based on reformulating the learning problem as a regularized Fredholm integral equation. Our approach fits naturally into the kernel framework and can be interpreted as constructing new data-dependent kernels, which we call Fredholm kernels. We proceed to discuss the noise assumption for semi-supervised learning and provide evidence, both theoretical and experimental, that Fredholm kernels can effectively utilize unlabeled data under the noise assumption. We demonstrate that methods based on Fredholm learning show very competitive performance in the standard semi-supervised learning setting.


ClearVision: Leveraging CycleGAN and SigLIP-2 for Robust All-Weather Classification in Traffic Camera Imagery

arXiv.org Artificial Intelligence

Adverse weather conditions challenge safe transportation, necessitating robust real-time weather detection from traffic camera imagery. We propose a novel framework combining CycleGAN-based domain adaptation with efficient contrastive learning to enhance weather classification, particularly in low-light nighttime conditions. Our approach leverages the lightweight SigLIP-2 model, which employs pairwise sigmoid loss to reduce computational demands, integrated with CycleGAN to transform nighttime images into day-like representations while preserving weather cues. Evaluated on an Iowa Department of Transportation dataset, the baseline EVA-02 model with CLIP achieves a per-class overall accuracy of 96.55% across three weather conditions (No Precipitation, Rain, Snow) and a day/night overall accuracy of 96.55%, but shows a significant day-night gap (97.21% day vs. 63.40% night). With CycleGAN, EVA-02 improves to 97.01% per-class accuracy and 96.85% day/night accuracy, boosting nighttime performance to 82.45%. Our Vision-SigLIP-2 + Text-SigLIP-2 + CycleGAN + Contrastive configuration excels in nighttime scenarios, achieving the highest nighttime accuracy of 85.90%, with 94.00% per-class accuracy and 93.35% day/night accuracy. This model reduces training time by 89% (from 6 hours to 40 minutes) and inference time by 80% (from 15 seconds to 3 seconds) compared to EVA-02. By narrowing the day-night performance gap from 33.81 to 8.90 percentage points, our framework provides a scalable, efficient solution for all-weather classification using existing camera infrastructure.
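The pairwise sigmoid loss mentioned above treats every image-text pair as an independent binary decision, so no softmax normalization over the whole batch is needed. The sketch below follows the commonly cited form of that loss; it assumes L2-normalized embeddings, and the default temperature and bias values are illustrative assumptions, not settings reported in the abstract.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def pairwise_sigmoid_loss(image_emb, text_emb, temperature=10.0, bias=-10.0):
    """Mean per-image loss over all image-text pairs in a batch.

    image_emb[i] and text_emb[i] are assumed to be L2-normalized vectors
    that belong together; every other combination is a negative pair.
    """
    n = len(image_emb)
    total = 0.0
    for i, img in enumerate(image_emb):
        for j, txt in enumerate(text_emb):
            logit = temperature * sum(a * b for a, b in zip(img, txt)) + bias
            label = 1.0 if i == j else -1.0
            total -= log_sigmoid(label * logit)
    return total / n
```

A batch whose matching pairs are aligned yields a much lower loss than the same batch with its captions swapped, which is the signal that drives the contrastive training.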


Whom to Trust? Adaptive Collaboration in Personalized Federated Learning

arXiv.org Artificial Intelligence

Data heterogeneity poses a fundamental challenge in federated learning (FL), especially when clients differ not only in distribution but also in the reliability of their predictions across individual examples. While personalized FL (PFL) aims to address this, we observe that many PFL methods fail to outperform two necessary baselines, local training and centralized training. This suggests that meaningful personalization only emerges in a narrow regime, where global models are insufficient, but collaboration across clients still holds value. Our empirical findings point to two key ingredients for success in this regime: adaptivity in collaboration and fine-grained trust, at the level of individual examples. We show that these properties can be achieved within federated semi-supervised learning, where clients exchange predictions over a shared unlabeled dataset. This enables each client to align with public consensus when it is helpful, and disregard it when it is not, without sharing model parameters or raw data. As a concrete realization of this idea, we develop FEDMOSAIC, a personalized co-training method where clients reweight their loss and their contribution to pseudo-labels based on per-example agreement and confidence. FEDMOSAIC outperforms strong FL and PFL baselines across a range of non-IID settings, and we prove convergence under standard smoothness, bounded-variance, and drift assumptions. In contrast to many of these baselines, it also outperforms local and centralized training. These results clarify when federated personalization can be effective, and how fine-grained, trust-aware collaboration enables it.
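The idea of weighting by per-example agreement and confidence can be sketched for a single shared unlabeled example. This is a hypothetical illustration and not FEDMOSAIC's actual weighting rule, which the abstract does not specify: confidence is taken to be the top predicted probability, and agreement is the consensus mass on a client's own top class.

```python
def consensus_pseudo_label(client_probs):
    """Confidence-weighted average of client predictions for one example.

    client_probs holds one probability vector per client. Using the top
    probability as the confidence weight is an assumption of this sketch.
    """
    weights = [max(p) for p in client_probs]
    total = sum(weights)
    n_classes = len(client_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, client_probs)) / total
        for c in range(n_classes)
    ]


def agreement_weight(own_probs, consensus):
    """Hypothetical trust score: consensus mass on the client's top class."""
    own_top = max(range(len(own_probs)), key=own_probs.__getitem__)
    return consensus[own_top]
```

A client whose top class matches the consensus receives a high weight for that example and can lean on the pseudo-label, while a dissenting client receives a low weight and relies more on its local signal. Only predictions on the shared unlabeled data are exchanged, never parameters or raw data.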