What should an AI assessor optimise for?
Romero-Alvarado, Daniel, Martínez-Plumed, Fernando, Hernández-Orallo, José
–arXiv.org Artificial Intelligence
An AI assessor is an external, ideally indepen-dent system that predicts an indicator, e.g., a loss value, of another AI system. Assessors can lever-age information from the test results of many other AI systems and have the flexibility of be-ing trained on any loss function or scoring rule: from squared error to toxicity metrics. Here we address the question: is it always optimal to train the assessor for the target metric? Or could it be better to train for a different metric and then map predictions back to the target metric? Us-ing twenty regression and classification problems with tabular data, we experimentally explore this question for, respectively, regression losses and classification scores with monotonic and non-monotonic mappings and find that, contrary to intuition, optimising for more informative met-rics is not generally better. Surprisingly, some monotonic transformations are promising. For example, the logistic loss is useful for minimis-ing absolute or quadratic errors in regression, and the logarithmic score helps maximise quadratic or spherical scores in classification.
arXiv.org Artificial Intelligence
Feb-1-2025
- Country:
- Europe (0.46)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Health & Medicine > Therapeutic Area (0.46)
- Leisure & Entertainment > Games (0.35)
- Technology: