SureMap: Simultaneous mean estimation for single-task and multi-task disaggregated evaluation

Mar-19-2026, 04:40:48 GMT–Neural Information Processing Systems

Disaggregated evaluation, the estimation of a machine learning model's performance on different subpopulations, is a core task when assessing the performance and group fairness of AI systems. A key challenge is that evaluation data is scarce, and subpopulations arising from intersections of attributes (e.g., race, sex, age) are often tiny. Today, it is common for multiple clients to procure the same AI model from a model developer, and each client faces the task of disaggregated evaluation individually. This gives rise to what we call the multi-task disaggregated evaluation problem, wherein multiple clients seek to conduct a disaggregated evaluation of a given model in their own data setting (task).
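To make the setting concrete, here is a minimal sketch of the naive baseline that disaggregated evaluation starts from: a separate sample mean of a 0/1 correctness score for each subgroup. It is not the SureMap estimator. The function name `disaggregated_accuracy`, the record layout, and the toy data are assumptions made purely for illustration.

```python
from collections import defaultdict


def disaggregated_accuracy(records):
    """Naive per-subgroup accuracy (illustrative only, not SureMap).

    `records` is an iterable of (group, correct) pairs, where `group` is a
    tuple of attribute values and `correct` is a bool.
    Returns a dict mapping each group to (accuracy, sample_size).
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [num_correct, num_examples]
    for group, correct in records:
        totals[group][0] += int(correct)
        totals[group][1] += 1
    return {g: (c / n, n) for g, (c, n) in totals.items()}


# Hypothetical toy data: groups are intersections of (sex, age band).
records = [
    (("F", "18-30"), True),
    (("F", "18-30"), False),
    (("F", "18-30"), True),
    (("F", "18-30"), True),
    (("M", "65+"), False),
]

results = disaggregated_accuracy(records)
for group, (acc, n) in sorted(results.items()):
    print(group, f"accuracy={acc:.2f}", f"n={n}")
```

In this toy data the second group has a single example, so its estimate rests on one observation. That is the small-sample problem with intersectional subpopulations that the abstract describes.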

artificial intelligence, machine learning, proceedings

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.97)