Jialu Wang, Computer Science and Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064

Neural Information Processing Systems 

Supplementary Material: Can Less be More?

Section A includes the proofs omitted from the main paper, as well as the extensions to the fairness-constrained setting (A.9) and to multi-class classification (A.10). Section B presents additional experimental details and results.

Combining all of the above, we finish the proof for the case $e < 0.5$:

$$P(h(X) \neq \hat{Y}) = P(h(X) = Y)\, e + P(h(X) \neq Y)\,(1 - e) = (1 - 2e)\, P(h(X) \neq Y) + e$$

A.3 Proof for Theorem 3

Proof. Again let l … Again, the last equality reuses Eqn. …

$$P(h(X) = +1 \mid \tilde{Y} = +1, Z = a) = \frac{P(h(X) = +1, \tilde{Y} = +1 \mid Z = a)}{P(\tilde{Y} = +1 \mid Z = a)} \tag{A10}$$

Again we use the trick of sampling so that $P(\tilde{Y} = +1 \mid Z = a) = 0.5$, which allows us to focus on the numerator.
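The identity that closes the $e < 0.5$ case, $P(h(X) \neq \hat{Y}) = (1 - 2e)\, P(h(X) \neq Y) + e$, can be sanity-checked with exact rational arithmetic. The sketch below is only an illustration, not the paper's code. It assumes a symmetric flip rate `e` applied independently of `h(X)` given `Y`, and the function names `noisy_error` and `affine_form` are hypothetical.

```python
from fractions import Fraction


def noisy_error(clean_error, e):
    """P(h(X) != Y_hat), assuming each label is flipped with probability e,
    independently of h(X) given Y (an assumption made for this sketch)."""
    # h agrees with Y and the label is flipped, or h disagrees and it is kept.
    return (1 - clean_error) * e + clean_error * (1 - e)


def affine_form(clean_error, e):
    """Right-hand side of the identity: (1 - 2e) * P(h(X) != Y) + e."""
    return (1 - 2 * e) * clean_error + e


e = Fraction(3, 10)  # illustrative flip rate below 0.5
clean_errors = [Fraction(k, 10) for k in range(11)]

# The two expressions agree exactly for every clean error rate on the grid.
for err in clean_errors:
    assert noisy_error(err, e) == affine_form(err, e)

# With e < 0.5 the slope (1 - 2e) is positive, so the ordering of
# classifiers by clean error is preserved under the noisy error.
noisy_errors = [noisy_error(err, e) for err in clean_errors]
assert noisy_errors == sorted(noisy_errors)
```

Because the slope $1 - 2e$ is positive only when $e < 0.5$, the same check with $e > 0.5$ would reverse the ordering, which is why the proof treats this case separately.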