Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis

Goldfarb-Tarrant, Seraphina, Ross, Björn, Lopez, Adam

May-22-2023–arXiv.org Artificial Intelligence

Sentiment analysis (SA) systems are widely deployed in many of the world's languages, and there is well-documented evidence of demographic bias in these systems. In languages beyond English, scarcer training data is often supplemented with transfer learning using pre-trained models, including multilingual models trained on other languages. In some cases, even supervision data comes from other languages. Does cross-lingual transfer also import new biases? To answer this question, we use counterfactual evaluation to test whether gender or racial biases are imported when using cross-lingual transfer, compared to a monolingual transfer setting. Across five languages, we find that systems using cross-lingual transfer usually become more biased than their monolingual counterparts. We also find racial biases to be much more prevalent than gender biases. To spur further research on this topic, we release the sentiment models we used for this study, and the intermediate checkpoints throughout training, yielding 1,525 distinct models; we also release our evaluation code.
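The counterfactual evaluation mentioned in the abstract can be illustrated with a minimal sketch: build sentence pairs that differ only in a demographic term, score both with the same sentiment system, and average the score difference. Everything named below is a hypothetical stand-in and not the authors' released code: `toy_score` is a deliberately biased toy lexicon scorer used in place of a trained model, and the templates, names, and `mean_score_gap` helper are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical toy lexicon standing in for a trained sentiment model.
# The small negative weight on "alice" is an intentionally planted bias
# so the evaluation has something to detect.
TOY_WEIGHTS = {"great": 1.0, "terrible": -1.0, "angry": -0.5, "alice": -0.1}


def toy_score(sentence: str) -> float:
    """Sum the toy weights of the tokens in a sentence (unknown tokens score 0)."""
    tokens = sentence.lower().replace(".", "").split()
    return sum(TOY_WEIGHTS.get(token, 0.0) for token in tokens)


def counterfactual_pairs(templates, group_a, group_b):
    """Yield sentence pairs that differ only in the demographic term."""
    for template in templates:
        for term_a, term_b in zip(group_a, group_b):
            yield template.format(person=term_a), template.format(person=term_b)


def mean_score_gap(score_fn, pairs):
    """Average score difference (group A minus group B) over all pairs.

    A value near zero means the scorer treats the two groups alike on
    these templates; a value far from zero signals a systematic gap.
    """
    return mean(score_fn(a) - score_fn(b) for a, b in pairs)


# Illustrative templates and names (assumptions, not from the paper).
templates = ["{person} made me feel angry.", "The talk with {person} was great."]
pairs = list(counterfactual_pairs(templates, ["Alice"], ["Bob"]))
gap = mean_score_gap(toy_score, pairs)
print(f"mean score gap (Alice minus Bob): {gap:+.3f}")
```

Under these assumptions, comparing the gap obtained from a monolingually trained system with the gap from a cross-lingually transferred one would indicate whether transfer increased the measured bias; the toy scorer here only shows the mechanics.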

computational linguistic, machine learning, natural language

Country:
- South America > Chile
  - Maule Region > Talca Province > Talca (0.04)
- North America > United States
  - Nevada (0.04)
  - New York
    - New York County > New York City (0.14)
    - Richmond County > New York City (0.04)
    - Queens County > New York City (0.04)
    - Kings County > New York City (0.04)
    - Bronx County > New York City (0.04)
  - Louisiana > Orleans Parish
    - New Orleans (0.04)
- Europe
  - Spain (0.04)
  - Germany (0.04)
  - Italy > Tuscany
    - Florence (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Japan (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Government (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning (1.00)
  - Natural Language
    - Information Extraction (0.71)
    - Discourse & Dialogue (0.71)
