Reviews: Counterfactual Fairness

Neural Information Processing Systems 

This paper presents an interesting and valuable contribution to the small but growing literature on fairness in machine learning. Specifically, it provides at least three contributions: (1) a definition of counterfactual fairness; (2) an algorithm for learning a model under counterfactual fairness; and (3) experiments with that algorithm. The value of these contributions is sufficient for acceptance, though significant improvements could be made in the clarity of exposition of the algorithm and the extent of the experimentation with it. Section 4.2 outlines why it is likely that very strong assumptions will need to be made to effectively estimate a model of Y under counterfactual fairness. These assumptions (and the analysis techniques they imply) suggest that the resulting conclusions will not be particularly robust to violations of those assumptions.