To be Robust or to be Fair: Towards Fairness in Adversarial Training
Xu, Han; Liu, Xiaorui; Li, Yaxin; Tang, Jiliang
Adversarial training algorithms have proven reliable for improving machine learning models' robustness against adversarial examples. However, we find that adversarial training algorithms tend to introduce severe disparity in accuracy and robustness between different groups of data. This phenomenon occurs even in balanced datasets and does not appear in naturally trained models, which use only clean samples. In this work, we theoretically show that this phenomenon can generally happen under adversarial training algorithms that minimize DNN models' robust errors. Motivated by these findings, we propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem in adversarial defenses, and experimental results validate the effectiveness of FRL. The existence of adversarial examples (Goodfellow et al., 2014; Szegedy et al., 2013) raises serious concerns about applying deep neural networks to safety-critical tasks, such as autonomous driving and face identification. These adversarial examples are artificially crafted samples.
Oct-12-2020
- Country:
- North America > United States
- Michigan (0.04)
- Asia > Middle East
- Jordan (0.04)
- Genre:
- Research Report (0.64)
- Industry:
- Information Technology (0.35)
- Automobiles & Trucks (0.34)
- Transportation > Ground
- Road (0.34)
- Technology: