Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP

Han, Xudong, Baldwin, Timothy, Cohn, Trevor

Feb-11-2023–arXiv.org Artificial Intelligence

Modern NLP systems exhibit a range of biases, which a growing literature on model debiasing attempts to correct. However, current progress is hampered by a plurality of definitions of bias and means of quantification, and by an often vague relation between debiasing algorithms and theoretical measures of bias. This paper seeks to clarify the current situation and plot a course for meaningful progress in fair learning, with two key contributions: (1) making clear the inter-relations among the current gamut of methods, and their relation to fairness theory; and (2) addressing the practical problem of model selection, which involves a trade-off between fairness and accuracy and has led to systemic issues in fairness research. Putting these together, we make several recommendations to help shape future work.

artificial intelligence, machine learning, natural language

Country:
- North America
  - Dominican Republic (0.04)
  - United States
    - Washington > King County
      - Seattle (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.04)
- Asia
  - China > Hong Kong (0.04)
  - South Korea > Seoul
    - Seoul (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine (0.68)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Representation & Reasoning > Optimization (0.46)
  - Machine Learning
    - Performance Analysis > Accuracy (0.69)
    - Statistical Learning (0.62)
