Pearls from Pebbles: Improved Confidence Functions for Auto-labeling

May-26-2025, 17:57:35 GMT–Neural Information Processing Systems

Auto-labeling is an important family of techniques that produce labeled training sets with minimum manual annotation. A prominent variant, threshold-based auto-labeling (TBAL), works by finding thresholds on a model's confidence scores above which it can accurately automatically label unlabeled data. However, many models are known to produce overconfident scores, leading to poor TBAL performance. While a natural idea is to apply off-the-shelf calibration methods to alleviate the overconfidence issue, we show that such methods fall short. Rather than experimenting with ad-hoc choices of confidence functions, we propose a framework for studying the optimal TBAL confidence function.

artificial intelligence, auto-labeling, machine learning, (3 more...)

Neural Information Processing Systems

May-26-2025, 17:57:35 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)