On Bias and Fairness in NLP: How to have a fairer text classification?
Elsafoury, Fatma, Katsigiannis, Stamos, Ramzan, Naeem
arXiv.org Artificial Intelligence
Recent research has shown that natural language processing (NLP) models are biased and systematically discriminate between people based on factors like ethnicity, gender, and others (Nangia et al., 2020; Elsafoury et al., 2022). The literature suggests four main sources of bias that have an impact on the fairness of NLP models: label bias, representation bias, sample bias, and overamplification bias (Shah et al., 2020; Hovy and Prabhumoye, 2021). In the NLP literature, these sources of bias are typically categorized as upstream bias, which includes representation bias, and downstream bias, which includes label, sample, and overamplification bias.

After that, to answer RQ1 and to understand the impact of upstream bias and its removal on the fairness of the task of text classification (Section 5), we measure upstream bias, remove it, and measure its impact on the fairness of text classification before and after removal. We then investigate downstream bias and its impact on the models' fairness (Section 6) to answer RQ2. Next, we use different methods to remove the downstream bias (Section 7) and investigate the impact of these debiasing methods (Section 7.3) on the models' fairness to answer RQ3. Finally, we analyse our results (Section 7.4) to identify the most effective bias removal technique, to answer RQ4 and to ensure the fairness of the task of text classification.
May-31-2023
- Country:
  - North America > United States
    - Washington > King County
      - Seattle (0.04)
    - New York > New York County
      - New York City (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
  - Europe
    - United Kingdom > Scotland (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
  - Asia
    - China > Hong Kong (0.04)
    - Middle East > UAE
      - Abu Dhabi Emirate > Abu Dhabi (0.04)
  - Africa > Ethiopia
    - Addis Ababa > Addis Ababa (0.04)
- Genre:
- Research Report > New Finding (1.00)