The power of Prompts: Evaluating and Mitigating Gender Bias in MT with LLMs

Aleix Sant, Carlos Escolano, Audrey Mash, Francesca De Luca Fornaciari, Maite Melero

Jul-26-2024, arXiv.org Artificial Intelligence

This paper studies gender bias in machine translation through the lens of Large Language Models (LLMs). Four widely used test sets are employed to benchmark various base LLMs, comparing their translation quality and gender bias against state-of-the-art Neural Machine Translation (NMT) models for the English to Catalan (En $\rightarrow$ Ca) and English to Spanish (En $\rightarrow$ Es) translation directions. Our findings reveal pervasive gender bias across all models, with base LLMs exhibiting a higher degree of bias than NMT models. To combat this bias, we explore prompt engineering techniques applied to an instruction-tuned LLM. We identify a prompt structure that reduces gender bias by up to 12% on the WinoMT evaluation dataset compared to more straightforward prompts. These results significantly narrow the gap in gender bias accuracy between LLMs and traditional NMT systems.

Country:
- Africa > Southern Africa (0.04)
- South America > Argentina
  - Pampas
    - Buenos Aires F.D. > Buenos Aires (0.05)
    - Buenos Aires Province (0.04)
- North America
  - Dominican Republic (0.04)
  - United States
    - Pennsylvania (0.04)
    - New York (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - Hawaii > Honolulu County
      - Honolulu (0.04)
    - California
      - San Diego County > San Diego (0.04)
      - Los Angeles County > Long Beach (0.04)
  - Canada
    - Ontario > Toronto (0.04)
    - Quebec > Montreal (0.04)
- Europe
  - Spain > Catalonia (0.04)
  - Portugal > Lisbon
    - Lisbon (0.04)
  - Poland > Lesser Poland Province
    - Kraków (0.04)
  - Middle East > Malta
    - Eastern Region > Northern Harbour District > St. Julian's (0.04)
  - Italy
    - Tuscany > Florence (0.04)
    - Lazio > Rome (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Finland > Pirkanmaa
    - Tampere (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - Singapore (0.04)
  - China > Hong Kong (0.04)
  - Middle East
    - UAE (0.04)
    - Qatar > Ad-Dawhah
      - Doha (0.04)

Genre:
- Research Report > New Finding (0.34)

Industry:
- Government (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Machine Translation (1.00)
    - Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.95)
