Reviews: Does mitigating ML's impact disparity require treatment disparity?

Neural Information Processing Systems 

This paper tackles a class of algorithms defined as Disparate Learning Processes (DLPs), which use the sensitive feature during training but make predictions without access to it. DLPs have appeared in multiple prior works, and the authors argue that DLPs do not necessarily guarantee treatment parity, which can in turn undermine impact parity. The theoretical analysis relates treatment disparity to utility and then derives optimal decision rules under various conditions. Most notably, per-group thresholding yields optimal rules for reducing the CV gap. As outlined at the beginning of Section 4, the theoretical advantages of treatment disparity seem to be optimality, rational ordering, and "no additional harm" to the protected group.
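The per-group thresholding idea can be illustrated with a minimal sketch (hypothetical helper names and synthetic data, not the paper's code): each group gets its own score threshold chosen so that both groups' positive rates match a target, which by construction drives the CV gap (the difference in group positive rates) toward zero.

```python
import numpy as np

def per_group_thresholds(scores, groups, target_rate):
    """Pick one threshold per group so that each group's positive
    rate is approximately target_rate (hypothetical helper)."""
    return {g: np.quantile(scores[groups == g], 1 - target_rate)
            for g in np.unique(groups)}

def predict(scores, groups, thresholds):
    # apply the group-specific threshold to each individual's score
    return np.array([s > thresholds[g] for s, g in zip(scores, groups)])

# synthetic scores: group "b" is systematically scored lower than group "a"
rng = np.random.default_rng(0)
groups = np.array(["a"] * 500 + ["b"] * 500)
scores = np.concatenate([rng.uniform(0.2, 1.0, 500),
                         rng.uniform(0.0, 0.8, 500)])

thr = per_group_thresholds(scores, groups, target_rate=0.3)
yhat = predict(scores, groups, thr)
rate_a = yhat[groups == "a"].mean()
rate_b = yhat[groups == "b"].mean()
cv_gap = abs(rate_a - rate_b)  # near zero by construction
```

A single shared threshold on these scores would select far more of group "a" than group "b"; the per-group rule equalizes selection rates directly, which is the sense in which treatment disparity serves impact parity.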