Be Intentional About Fairness!: Fairness, Size, and Multiplicity in the Rashomon Set

Dai, Gordon; Ravishankar, Pavan; Yuan, Rachel; Neill, Daniel B.; Black, Emily

arXiv.org Artificial Intelligence 

This phenomenon, often called the Rashomon effect [7], predictive multiplicity [22], or model multiplicity [5], has wide-ranging implications for both understanding and improving fairness, because these equally accurate models often differ substantially in other properties such as fairness [21, 28] or model simplicity [29-31]. As prior work has pointed out, this multiplicity of models can be viewed as both a fairness opportunity and a fairness concern [5, 10]. On the positive side, legal scholarship has argued that model multiplicity bears on how U.S. anti-discrimination law is interpreted and enforced and, specifically, that it can strengthen the disparate impact doctrine to combat algorithmic discrimination more effectively [3]. In a recent paper, Black et al. [3] suggest that model multiplicity could support a reading of the disparate impact doctrine that requires companies to proactively search the set of equally accurate models for less discriminatory alternatives, that is, models whose accuracy is equivalent to that of a base model already deemed acceptable for deployment on performance grounds. On the negative side, several scholars have pointed out that facially similar models, with equivalent accuracy but different individual predictions, suggest that some model decisions are arbitrary: those decisions appear to depend on a choice among models that does not affect performance (e.g., a <1% change in a model's training set accuracy) [2, 17, 22]. This arbitrariness can affect model explanations and recourse as well: individuals whose decisions are unstable across small model changes may not receive reliable explanations for their outcome, or reliable ways to change it [4, 6, 25]. Further, a group-based asymmetry of arbitrariness (e.g., female loan applicants facing more arbitrariness in their decisions than male loan applicants) could constitute a group-based equity concern in and of itself.
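To make the notion of group-based asymmetry of arbitrariness concrete, the following is a minimal sketch, not the procedure used in this paper. It assumes a handful of candidate models whose predictions are already available, keeps those whose accuracy is within a tolerance of the best one, and reports, per group, the fraction of individuals on whom the retained models disagree. All names (`near_optimal_models`, `ambiguous_flags`, `ambiguity_by_group`, `epsilon`) and the toy data are hypothetical, and the simple disagreement rate stands in for whatever arbitrariness measure one prefers.

```python
from typing import Dict, List, Sequence


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def near_optimal_models(
    all_preds: Dict[str, List[int]], labels: Sequence[int], epsilon: float
) -> List[str]:
    """Names of models whose accuracy is within epsilon of the best model."""
    accs = {name: accuracy(p, labels) for name, p in all_preds.items()}
    best = max(accs.values())
    return [name for name, acc in accs.items() if acc >= best - epsilon]


def ambiguous_flags(
    all_preds: Dict[str, List[int]], model_names: Sequence[str]
) -> List[bool]:
    """True for each individual on whom the selected models disagree."""
    rows = zip(*(all_preds[m] for m in model_names))
    return [len(set(row)) > 1 for row in rows]


def ambiguity_by_group(
    flags: Sequence[bool], groups: Sequence[str]
) -> Dict[str, float]:
    """Share of individuals with conflicting predictions, per group."""
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for flag, group in zip(flags, groups):
        totals[group] = totals.get(group, 0) + 1
        hits[group] = hits.get(group, 0) + int(flag)
    return {g: hits[g] / totals[g] for g in totals}


# Toy data (hypothetical): eight applicants, two groups, three candidate models.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]
all_preds = {
    "model_a": [1, 0, 1, 1, 1, 0, 1, 0],
    "model_b": [1, 0, 0, 0, 1, 0, 1, 0],
    "model_c": [0, 1, 1, 0, 1, 0, 1, 1],
}

# Assumed tolerance of one percentage point of accuracy.
selected = near_optimal_models(all_preds, labels, epsilon=0.01)
flags = ambiguous_flags(all_preds, selected)
rates = ambiguity_by_group(flags, groups)
print(selected, rates)
```

In this toy example, `model_a` and `model_b` are equally accurate yet disagree only on applicants in group `F`, so all of the arbitrariness falls on one group even though overall accuracy is identical.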
Understanding the extent of the benefits and risks of model multiplicity relies upon an understanding of the properties of the Rashomon set, or the set of approximately equally accurate models for a given prediction task, i.e., equally accurate up to