An Axiomatic Theory of Provably-Fair Welfare-Centric Machine Learning

Neural Information Processing Systems 

We address an inherent difficulty in welfare-theoretic fair machine learning (ML), by proposing an equivalently-axiomatically justified alternative setting, and studying the resulting computational and statistical learning questions. Welfare metrics quantify overall wellbeing across a population of groups, and welfare-based objectives and constraints have recently been proposed to incentivize fair ML methods to satisfy their diverse needs. However, many ML problems are cast as loss minimization tasks, rather than utility maximization, and thus require nontrivial modeling to construct utility functions. We define a complementary metric, termed malfare, measuring overall societal harm, with axiomatic justification via the standard axioms of cardinal welfare, and cast fair ML as malfare minimization over the risk values (expected losses) of each group. Surprisingly, the axioms of cardinal welfare (malfare) dictate that this is not equivalent to simply defining utility as negative loss and maximizing welfare.