Enhancing Self-Training Methods

Radhakrishnan, Aswathnarayan, Davis, Jim, Rabin, Zachary, Lewis, Benjamin, Scherreik, Matthew, Ilin, Roman

Jan-17-2023–arXiv.org Artificial Intelligence

Semi-supervised learning approaches train on small sets of labeled data along with large sets of unlabeled data. Self-training is a semi-supervised teacher-student approach that often suffers from "confirmation bias", which occurs when the student model repeatedly overfits to incorrect pseudo-labels that the teacher model assigns to the unlabeled data. This bias impedes improvements in pseudo-label accuracy across self-training iterations, causing model performance to saturate after just a few iterations. In this work, we describe multiple enhancements to the self-training pipeline that mitigate the effect of confirmation bias. We evaluate our enhancements on multiple datasets and show performance gains over existing self-training design choices. Finally, we study how well our enhanced approach extends to Open Set unlabeled data (containing classes not seen in the labeled data).
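
The teacher-student loop described in the abstract can be illustrated with a minimal sketch of generic self-training. It is not the enhanced pipeline from the paper, whose details the abstract does not give. The nearest-centroid model, the confidence threshold of 0.8, the toy data, and all function names are assumptions chosen for illustration. Filtering pseudo-labels by confidence is one common way to limit the incorrect pseudo-labels that drive confirmation bias.

```python
import math

def fit_centroids(points, labels):
    """Train a nearest-centroid classifier: one mean point per class."""
    sums, counts = {}, {}
    for (x, y), c in zip(points, labels):
        sx, sy = sums.get(c, (0.0, 0.0))
        sums[c] = (sx + x, sy + y)
        counts[c] = counts.get(c, 0) + 1
    return {c: (sums[c][0] / counts[c], sums[c][1] / counts[c]) for c in sums}

def predict_with_confidence(centroids, point):
    """Return (label, confidence) from normalized exp(-distance) scores."""
    classes = sorted(centroids)
    scores = [math.exp(-math.dist(point, centroids[c])) for c in classes]
    best = max(range(len(classes)), key=lambda i: scores[i])
    return classes[best], scores[best] / sum(scores)

def self_train(labeled, labels, unlabeled, iterations=3, threshold=0.8):
    """Generic self-training loop (hypothetical helper, not the paper's method)."""
    model = fit_centroids(labeled, labels)  # initial teacher, labeled data only
    for _ in range(iterations):
        # The teacher assigns pseudo-labels to the unlabeled pool.
        pseudo = [(p, predict_with_confidence(model, p)) for p in unlabeled]
        # Keep only confident pseudo-labels (assumed filtering rule).
        kept = [(p, c) for p, (c, conf) in pseudo if conf >= threshold]
        # The student trains on labeled plus pseudo-labeled data,
        # then serves as the teacher for the next iteration.
        model = fit_centroids(labeled + [p for p, _ in kept],
                              labels + [c for _, c in kept])
    return model

# Toy data: two labeled points and five unlabeled points (illustrative only).
labeled = [(0.0, 0.0), (4.0, 4.0)]
labels = [0, 1]
unlabeled = [(0.5, 0.2), (0.1, 0.6), (3.8, 4.1), (4.2, 3.6), (2.0, 2.0)]
model = self_train(labeled, labels, unlabeled)
```

In this toy run the ambiguous point (2.0, 2.0) never reaches the threshold, so it is left out of every student's training set. Without the threshold it would be assigned a class on a near coin-flip and pull that class's centroid toward the boundary, a small-scale version of the error accumulation the abstract describes.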

Country:
- North America > United States
  - Ohio (0.04)
  - Wisconsin > Dane County
    - Madison (0.04)

Genre:
- Research Report (0.40)

Industry:
- Education (0.90)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Unsupervised or Indirectly Supervised Learning (0.97)
  - Neural Networks > Deep Learning (0.69)
  - Performance Analysis > Accuracy (0.68)