Evaluating Gender Bias of Pre-trained Language Models in Natural Language Inference by Considering All Labels
Anantaprayoon, Panatchakorn, Kaneko, Masahiro, Okazaki, Naoaki
arXiv.org Artificial Intelligence
Discriminatory social biases, including gender bias, have been found in Pre-trained Language Models (PLMs). In Natural Language Inference (NLI), recent bias evaluation methods have observed biased inferences through the outputs of a single label, such as neutral or entailment. However, since different biased inferences can correspond to different output labels, relying on one label is inaccurate. In this work, we propose an evaluation method that considers all labels in the NLI task. We create evaluation data and assign each instance to a group based on its expected biased output label. We then define a bias measure based on the corresponding label output of each data group. In our experiments, we propose a meta-evaluation method for NLI bias measures and use it to confirm that our measure evaluates bias more accurately than the baseline. Moreover, we show that our evaluation method is applicable to multiple languages by conducting the meta-evaluation on PLMs in three languages: English, Japanese, and Chinese. Finally, we evaluate PLMs in each language to characterize their bias tendencies. To our knowledge, we are the first to build evaluation datasets and measure the bias of PLMs from the NLI task in Japanese and Chinese.
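The label-aware evaluation described above can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' exact formula: here each evaluation instance carries its group's expected biased label and the label a model actually predicted, and the score is the fraction of instances predicted as their group's biased label, averaged over groups. Higher values indicate stronger bias.

```python
from collections import defaultdict

def label_aware_bias_score(examples):
    """Hypothetical sketch of an all-label NLI bias measure.

    examples: list of dicts with keys
      "group": the expected biased output label for this instance's group
               (e.g. "entailment", "neutral", "contradiction")
      "pred":  the label the NLI model actually produced
    Returns the per-group rate of predicting the group's biased label,
    averaged over all groups.
    """
    hits = defaultdict(int)    # biased-label predictions per group
    totals = defaultdict(int)  # instances per group
    for ex in examples:
        totals[ex["group"]] += 1
        if ex["pred"] == ex["group"]:
            hits[ex["group"]] += 1
    rates = [hits[g] / totals[g] for g in totals]
    return sum(rates) / len(rates)

# Toy data: in the "entailment" group, half the predictions are biased;
# in the "neutral" group, all of them are.
data = [
    {"group": "entailment", "pred": "entailment"},
    {"group": "entailment", "pred": "neutral"},
    {"group": "neutral", "pred": "neutral"},
    {"group": "neutral", "pred": "neutral"},
]
print(label_aware_bias_score(data))  # → 0.75
```

In a real evaluation, the `"pred"` labels would come from a PLM fine-tuned on NLI, run over premise-hypothesis pairs constructed for each group.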
Sep-18-2023