A Proof of Theorem 3.1
Neural Information Processing Systems
As proved by Feng et al. (2021), the binary cross-entropy loss …
We include more results on the teacher model and on the teacher model combined with DRO (Hashimoto et al., 2018), ARL (Lahoti et al., 2020), FairRF (Zhao et al., 2022), or our knowledge distillation in Tab. …
The effect of our label smoothing can be observed by comparing "Teacher (with hard label)" and "Teacher (with softmax/linear label)" in the 6 tables. Here the capacity is the same; the only difference is the label smoothing.
Here the training method is the same; the only difference is capacity.
Table 8: Results on the COMPAS dataset with sensitive attribute race.
Aug-16-2025, 04:21:26 GMT