ULF: Unsupervised Labeling Function Correction using Cross-Validation for Weak Supervision
Anastasiia Sedova, Benjamin Roth
arXiv.org Artificial Intelligence
A cost-effective alternative to manual data labeling is weak supervision (WS), where data samples are automatically annotated using a predefined set of labeling functions (LFs): rule-based mechanisms that generate artificial labels for the associated classes. In this work, we investigate noise reduction techniques for WS based on the principle of k-fold cross-validation. We introduce ULF, a new algorithm for Unsupervised Labeling Function correction, which denoises WS data by leveraging models trained on all but a held-out subset of LFs to identify and correct biases specific to the held-out LFs. Specifically, ULF refines the allocation of LFs to classes by re-estimating this assignment on highly reliable cross-validated samples. Evaluation on multiple datasets confirms ULF's effectiveness in enhancing WS learning without the need for manual labeling.
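The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the nearest-centroid stand-in classifier, the confidence `threshold`, and the mixing weight `keep` are all assumptions made for illustration. It assumes a binary match matrix `Z` (samples x LFs) and an LF-to-class matrix `T` (LFs x classes). LFs are split into folds, a model is trained on samples that match none of the held-out LFs, and each held-out LF's class assignment is re-estimated from confident out-of-fold predictions.

```python
import numpy as np

def centroid_proba(X_train, y_train, X_test, n_classes):
    """Stand-in classifier (assumption): softmax over negative centroid distances."""
    logits = np.full((len(X_test), n_classes), -np.inf)
    for c in range(n_classes):
        members = X_train[y_train == c]
        if len(members):
            centroid = members.mean(axis=0)
            logits[:, c] = -np.linalg.norm(X_test - centroid, axis=1)
    logits -= logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def reestimate_lf_classes(X, Z, T, n_folds=3, threshold=0.6, keep=0.3):
    """Re-estimate the LF-to-class matrix from cross-validated predictions.

    threshold and keep are hypothetical hyperparameters of this sketch.
    """
    n_lfs, n_classes = T.shape
    folds = np.array_split(np.arange(n_lfs), n_folds)  # folds over LFs, not samples
    noisy = (Z @ T).argmax(axis=1)                     # weak labels from current T
    counts = np.zeros((n_lfs, n_classes))
    for held_out in folds:
        test = Z[:, held_out].sum(axis=1) > 0          # matched by a held-out LF
        train = ~test & (Z.sum(axis=1) > 0)            # matched only by other LFs
        if not train.any() or not test.any():
            continue
        proba = centroid_proba(X[train], noisy[train], X[test], n_classes)
        pred = proba.argmax(axis=1)
        confident = proba.max(axis=1) >= threshold     # keep reliable samples only
        Z_test = Z[test][:, held_out]
        for j, lf in enumerate(held_out):
            hits = (Z_test[:, j] > 0) & confident
            counts[lf] += np.bincount(pred[hits], minlength=n_classes)
    row_sums = counts.sum(axis=1, keepdims=True)
    # LFs without any confident sample fall back to their original assignment.
    estimated = np.divide(counts, row_sums, out=T.astype(float), where=row_sums > 0)
    return keep * T + (1 - keep) * estimated

# Toy data: class 0 lies near (0, 0), class 1 near (6, 6).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6],
              [6, 5], [6, 6], [5, 7], [6, 7], [7, 5], [7, 6]], dtype=float)
Z = np.zeros((12, 6), dtype=int)
for lf, rows in enumerate([(0, 1), (4, 5), (2, 3), (6, 7), (8, 9), (10, 11)]):
    Z[list(rows), lf] = 1
# LF 4 is deliberately assigned to class 0 although it fires on class-1 samples.
T = np.array([[1, 0], [0, 1], [1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)

T_new = reestimate_lf_classes(X, Z, T)
print(T_new.round(2))
```

In this toy setup the deliberately mis-assigned LF (row 4) shifts most of its weight to class 1, while the other rows are left unchanged. The choice to blend the old and re-estimated matrices rather than overwrite `T` is a design assumption of the sketch, meant to keep the prior assignment from being discarded on the basis of a few samples.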
Jan-3-2024
- Country:
  - Africa (0.04)
  - Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
  - Europe
    - Austria > Vienna (0.14)
    - Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
    - Switzerland (0.04)
  - North America > United States (0.05)
- Genre:
  - Research Report (1.00)
- Industry:
  - Health & Medicine (0.46)
- Technology: