subpopulation
SureMap: Simultaneous mean estimation for single-task and multi-task disaggregated evaluation
Disaggregated evaluation, the estimation of a machine learning model's performance on different subpopulations, is a core task when assessing the performance and group fairness of AI systems. A key challenge is that evaluation data is scarce, and subpopulations arising from intersections of attributes (e.g., race, sex, age) are often tiny. Today, it is common for multiple clients to procure the same AI model from a model developer, and each customer faces the task of disaggregated evaluation individually. This gives rise to what we call the multi-task disaggregated evaluation problem, wherein multiple clients seek to conduct a disaggregated evaluation of a given model in their own data setting (task).
When Worlds Collide: Integrating Different Counterfactual Assumptions in Fairness
Machine learning is now being used to make crucial decisions about people's lives. For nearly all of these decisions there is a risk that individuals of a certain race, gender, sexual orientation, or any other subpopulation are unfairly discriminated against. Our recent method has demonstrated how to use techniques from counterfactual inference to make predictions fair across different subpopulations. This method requires that one provides the causal model that generated the data at hand. In general, validating all causal implications of the model is not possible without further assumptions.