Debiasing Convolutional Neural Networks via Meta Orthogonalization
David, Kurtis Evan; Liu, Qiang; Fong, Ruth
arXiv.org Artificial Intelligence
While deep learning models often achieve strong task performance, their successes are hampered by their inability to disentangle spurious correlations from causative factors, such as when they use protected attributes (e.g., race or gender) to make decisions. In this work, we tackle the problem of debiasing convolutional neural networks (CNNs) in such instances. Building on existing work on debiasing word embeddings and model interpretability, our Meta Orthogonalization method encourages the CNN representations of different concepts (e.g., gender and class labels) to be orthogonal to one another in activation space while maintaining strong downstream task performance. Through a variety of experiments, we systematically test our method and demonstrate that it significantly mitigates model bias and is competitive with current adversarial debiasing methods.
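The idea of pushing concept representations toward orthogonality can be illustrated with a small sketch. The abstract does not give the loss, so everything here is an assumption: the function names `cosine` and `orthogonality_penalty`, the representation of each concept as a single vector in activation space, and the choice of mean squared cosine similarity as the penalty are hypothetical and are not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def orthogonality_penalty(bias_vec, class_vecs):
    """Mean squared cosine similarity between one bias concept vector
    (e.g., a hypothetical 'gender' direction) and each class concept vector.
    The value is 0 when the bias vector is orthogonal to every class vector
    and grows toward 1 as the directions align. Illustrative assumption only.
    """
    return sum(cosine(bias_vec, c) ** 2 for c in class_vecs) / len(class_vecs)


# Toy activation-space vectors (made up for illustration).
gender = [1.0, 0.0, 0.0]
classes = [
    [0.0, 1.0, 0.0],  # orthogonal to the bias direction
    [1.0, 1.0, 0.0],  # partially aligned with the bias direction
]
penalty = orthogonality_penalty(gender, classes)
```

In a training setup, a term of this kind would typically be added to the task loss with a weighting coefficient, so that the model trades off orthogonality against downstream accuracy; that weighting is likewise an assumption rather than something stated in the abstract.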
Nov-15-2020
- Country:
- North America > United States
- New York > New York County > New York City (0.04)
- Europe
- United Kingdom > England
- Oxfordshire > Oxford (0.04)
- Sweden > Stockholm
- Stockholm (0.04)
- Italy > Tuscany
- Florence (0.04)
- Genre:
- Research Report (0.64)
- Technology: