Examining Temporal Bias in Abusive Language Detection
Jin, Mali; Mu, Yida; Maynard, Diana; Bontcheva, Kalina
arXiv.org Artificial Intelligence
In recent years, researchers have developed a huge variety of machine learning models that can automatically detect abusive language (Mishra et al. 2019; Aurpa, Sadik, and Ahmed 2022; Das and Mukherjee 2023; Alrashidi, Jamal, and Alkhathlan 2023). However, these models may be subject to temporal bias, which can lead to a decrease in the accuracy of abusive language detection models, potentially allowing abusive language to be undetected or falsely detected. Temporal bias arises from differences in populations and ...

Previous work identified temporal bias in an Italian hate speech data set associated with immigrants (Florio et al. 2020). However, they have yet to explore temporal factors affecting predictive performance from a multilingual perspective. In this paper, we explore temporal bias in 5 different abusive data sets that span varying time periods, in 4 languages (English, Spanish, Italian, and Chinese). Specifically, we investigate the following core research questions:

RQ1: How does the magnitude of temporal bias vary across different data sets such as language, time span and collection methods?
Sep-25-2023
- Country:
- Asia > Middle East (0.46)
- North America (0.28)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)