Reviews: When Worlds Collide: Integrating Different Counterfactual Assumptions in Fairness
–Neural Information Processing Systems
This paper tackles the primary criticism of applying causal graphical models to fairness: the results are valid only if one fully believes the assumed causal model. Instead, it presents a definition of fairness under which we can assume many plausible causal models and require fairness violations to stay below a threshold under every one of them. The authors give a simple way to express this idea formally: they define an approximate notion of counterfactual fairness and use the amount of fairness violation as a regularizer for a supervised learner. This is an important theoretical advance, and I think it can lead to promising work. The key part, then, is to develop a method to construct counterfactual estimates. This is a hard problem because, even for a single causal model, there might be unknown and unobserved confounders that affect the relationships between observed variables.
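To make the regularizer idea concrete, here is a minimal sketch of how a fairness-violation penalty summed over several candidate causal models might look. It is an illustration under stated assumptions, not the authors' implementation: the linear predictor, the squared-error loss, the hinge-style penalty, the names `predict`, `fairness_violation`, and `objective`, the parameters `eps` and `lam`, and the toy counterfactual features are all hypothetical. In particular, the counterfactual features are simply given here, whereas constructing them is the hard part noted above.

```python
def predict(w, x):
    """Linear predictor f(x) = w . x (an assumed model class)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def fairness_violation(w, X, X_cf, eps):
    """Mean amount by which |f(x) - f(x_cf)| exceeds the tolerance eps."""
    excess = [
        max(0.0, abs(predict(w, x) - predict(w, x_cf)) - eps)
        for x, x_cf in zip(X, X_cf)
    ]
    return sum(excess) / len(excess)


def objective(w, X, y, X_cf_by_model, eps=0.1, lam=1.0):
    """Squared-error loss plus a penalty summed over all candidate causal models."""
    loss = sum((predict(w, x) - t) ** 2 for x, t in zip(X, y)) / len(y)
    penalty = sum(fairness_violation(w, X, X_cf, eps) for X_cf in X_cf_by_model)
    return loss + lam * penalty


# Toy data (hypothetical): feature 0 depends on the protected attribute, feature 1 does not.
X = [[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]
y = [3.0, 1.0, 1.0]

# Counterfactual features under two assumed causal models.
X_cf_a = [[1.0 - x[0], x[1]] for x in X]  # model A: flipping the attribute flips feature 0
X_cf_b = [list(x) for x in X]             # model B: the attribute has no effect

w_uses_attr = [1.0, 1.0]   # fits y exactly but relies on feature 0
w_ignores_attr = [0.0, 1.0]  # ignores feature 0

score_uses = objective(w_uses_attr, X, y, [X_cf_a, X_cf_b])
score_ignores = objective(w_ignores_attr, X, y, [X_cf_a, X_cf_b])
```

In this toy setup the weights that ignore the attribute-dependent feature incur no penalty under either model, while the weights that fit the data exactly are penalized under model A. Raising `lam` pushes the learner toward predictors whose violations stay within `eps` for every candidate model at once.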
Oct-7-2024, 14:01:55 GMT