Overcoming Common Flaws in the Evaluation of Selective Classification Systems

Neural Information Processing Systems 

While current evaluation of these systems typically assumes fixed working points based on pre-defined rejection thresholds, methodological progress requires benchmarking the general performance of systems, akin to the AUROC in standard classification. In this work, we define five requirements for multi-threshold metrics in selective classification regarding task alignment, interpretability, and flexibility, and show how current approaches fail to meet them.
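
To make the contrast between a fixed working point and a multi-threshold metric concrete, the following is a minimal sketch in Python with NumPy. It assumes each prediction comes with a scalar confidence score and a 0/1 correctness flag; the function names and the toy data are hypothetical and do not come from the paper. The aggregate shown is the plain area under the risk-coverage curve, a commonly used multi-threshold summary, and it is meant only to illustrate sweeping over all rejection thresholds, not as the metric the authors advocate.

```python
import numpy as np


def selective_risk_at_threshold(confidence, correct, threshold):
    """Coverage and selective risk at ONE fixed rejection threshold.

    Predictions with confidence below `threshold` are rejected.
    """
    confidence = np.asarray(confidence, dtype=float)
    errors = 1.0 - np.asarray(correct, dtype=float)
    accepted = confidence >= threshold
    if not accepted.any():
        # Nothing accepted: coverage is zero and the risk is undefined.
        return 0.0, float("nan")
    return float(accepted.mean()), float(errors[accepted].mean())


def risk_coverage_curve(confidence, correct):
    """Selective risk at every coverage level (all thresholds at once)."""
    confidence = np.asarray(confidence, dtype=float)
    errors = 1.0 - np.asarray(correct, dtype=float)
    # Accept the most confident predictions first.
    order = np.argsort(-confidence, kind="stable")
    kept = np.arange(1, len(errors) + 1)
    coverage = kept / len(errors)
    risk = np.cumsum(errors[order]) / kept
    return coverage, risk


def area_under_risk_coverage(confidence, correct):
    """Average selective risk over all coverage levels (illustrative only)."""
    _, risk = risk_coverage_curve(confidence, correct)
    return float(risk.mean())


# Hypothetical toy data: four predictions, one of them wrong.
conf = [0.9, 0.8, 0.7, 0.6]
hit = [1, 1, 0, 1]

print(selective_risk_at_threshold(conf, hit, threshold=0.75))
print(area_under_risk_coverage(conf, hit))
```

The single-threshold function answers a question about one operating point, whereas the curve-based summary aggregates over every possible threshold, which is the kind of quantity the requirements above are stated for.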
