Learning Label Refinement and Threshold Adjustment for Imbalanced Semi-Supervised Learning

Li, Zeju, Zheng, Ying-Qiu, Chen, Chen, Jbabdi, Saad

Jul-7-2024–arXiv.org Artificial Intelligence

Semi-supervised learning (SSL) algorithms struggle to perform well when exposed to imbalanced training data. In this scenario, the generated pseudo-labels can exhibit a bias towards the majority class, and models that train on these pseudo-labels can further amplify this bias. Here we investigate pseudo-labeling strategies for imbalanced SSL, including pseudo-label refinement and threshold adjustment, through the lens of statistical analysis. We find that existing SSL algorithms that generate pseudo-labels using heuristic strategies or uncalibrated model confidence are unreliable when imbalanced class distributions bias the pseudo-labels. To address this, we introduce SEmi-supervised learning with pseudo-label optimization based on VALidation data (SEVAL) to enhance the quality of pseudo-labeling for imbalanced SSL. We propose to learn refinement and thresholding parameters from a partition of the training dataset in a class-balanced way. SEVAL adapts to specific tasks with improved pseudo-label accuracy and ensures pseudo-label correctness on a per-class basis. Our experiments show that SEVAL surpasses state-of-the-art SSL methods, delivering more accurate and effective pseudo-labels in various imbalanced SSL situations. Because it is simple and flexible, SEVAL can be used to enhance a variety of SSL techniques.
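To make the idea of per-class threshold adjustment concrete, the following is a minimal sketch and not the SEVAL procedure itself, whose details the abstract does not give. It assumes a held-out labeled partition with predicted class probabilities, and picks for each class the lowest confidence threshold at which held-out precision reaches a target. The function names `per_class_thresholds` and `select_pseudo_labels`, the `target_precision` parameter, and the grid search are all illustrative assumptions.

```python
import numpy as np

def per_class_thresholds(val_probs, val_labels, target_precision=0.9, grid=None):
    """Hypothetical helper: for each class, return the lowest confidence
    threshold whose precision on the held-out partition meets the target.
    This is a simplified stand-in, not the paper's optimization."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)
    n_classes = val_probs.shape[1]
    preds = val_probs.argmax(axis=1)   # predicted class per held-out sample
    conf = val_probs.max(axis=1)       # confidence of that prediction
    # Default of 1.0 means a class accepts (almost) nothing unless the
    # held-out data shows that some lower threshold is precise enough.
    thresholds = np.ones(n_classes)
    for c in range(n_classes):
        predicted_c = preds == c
        for t in grid:                 # grid is ascending
            keep = predicted_c & (conf >= t)
            if keep.sum() == 0:
                break                  # no held-out samples left for this class
            precision = (val_labels[keep] == c).mean()
            if precision >= target_precision:
                thresholds[c] = t
                break
    return thresholds

def select_pseudo_labels(unlabeled_probs, thresholds):
    """Keep an unlabeled sample only if its confidence clears the
    threshold of its own predicted class."""
    preds = unlabeled_probs.argmax(axis=1)
    conf = unlabeled_probs.max(axis=1)
    mask = conf >= thresholds[preds]
    return preds, mask

# Toy held-out partition: class 0 predictions are noisy at low confidence,
# class 1 predictions are all correct.
val_probs = np.array([[0.9, 0.1], [0.625, 0.375], [0.555, 0.445],
                      [0.3, 0.7], [0.2, 0.8]])
val_labels = np.array([0, 1, 0, 1, 1])
thresholds = per_class_thresholds(val_probs, val_labels)

unlabeled_probs = np.array([[0.7, 0.3], [0.6, 0.4], [0.4, 0.6]])
pseudo_labels, mask = select_pseudo_labels(unlabeled_probs, thresholds)
```

Because precision is estimated separately for each class, a class whose predictions are unreliable at low confidence ends up with a stricter threshold than a class whose predictions are already accurate, which is the intuition behind setting thresholds per class rather than globally.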

artificial intelligence, inductive learning, machine learning

Country:
- Europe
  - Germany (0.14)
  - United Kingdom (0.14)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Inductive Learning (0.93)
  - Neural Networks (0.93)
  - Performance Analysis > Accuracy (0.67)
  - Unsupervised or Indirectly Supervised Learning (0.93)
