AQuA: A Benchmarking Tool for Label Quality Assessment
Mononito Goswami, Vedant Sanil, Arjun Choudhry, Arvind Srinivasan, Chalisa Udompanyawit, Artur Dubrawski
arXiv.org Artificial Intelligence
Machine learning (ML) models are only as good as the data they are trained on. But recent studies have found that datasets widely used to train and evaluate ML models, e.g., ImageNet, contain pervasive labeling errors. Erroneous labels in the training set hurt ML models' ability to generalize, and erroneous labels in the test set distort evaluation and model selection. Consequently, learning in the presence of labeling errors is an active area of research, yet the field lacks a comprehensive benchmark for evaluating these methods. Most are evaluated on a few computer vision datasets, with significant variance in experimental protocols. With such a large pool of methods and inconsistent evaluation, it is also unclear how ML practitioners can choose the right models to assess label quality in their data. To this end, we propose AQuA, a benchmarking environment to rigorously evaluate methods that enable machine learning in the presence of label noise. We also introduce a design space to delineate the concrete design choices of label error detection models. We hope that our proposed design space and benchmark enable practitioners to choose the right tools to improve their label quality, and that our benchmark enables objective and rigorous evaluation of machine learning tools facing mislabeled data.
Jun-15-2023