A Missing Proofs

Neural Information Processing Systems 

Proposition 2. F or a given group a A, gradient norms can be upper bounded as: g Proposition 3. Consider a binary classifier B.1 Datasets The paper uses the following datasets to validate the findings discussed in the main paper: The experiments adopt the following attributes for classification (e.g., Y) and as protected group ( A): ethnicity, age bins, gender. B.2 Architectures, Hyper-parameters, and Settings The study adopts the following architectures to validate the results of the main paper: The model has 11 million trainable parameters. ResNet50 This model contains 48 convolution layers, 1 MaxPool layer and a AvgPool layer. ResNet50 has 25 million trainable parameters. VGG-19 This model consists of 19 layers (16 convolution layers, 3 fully connected layers, 5 MaxPool layers and 1 SoftMax layer).