Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions
Himanshu Thakur, Atishay Jain, Praneetha Vaddamanu, Paul Pu Liang, Louis-Philippe Morency
arXiv.org Artificial Intelligence
Societal biases present in pre-trained large language models are a critical issue, as these models have been shown to propagate biases into countless downstream applications, rendering them unfair towards specific groups of people. Since large-scale retraining of these models from scratch is both time- and compute-intensive, a variety of approaches have previously been proposed to de-bias a pre-trained model. While the majority of current state-of-the-art debiasing methods focus on changes to the training regime, in this paper we propose data intervention strategies as a powerful yet simple technique for reducing gender bias in pre-trained models. Specifically, we empirically show that fine-tuning a pre-trained model on only 10 de-biased (intervened) training examples significantly reduces the tendency to favor any gender. Because our proposed method needs only a few training examples, our few-shot debiasing approach is highly feasible and practical. Through extensive experimentation, we show that our debiasing technique outperforms competitive state-of-the-art baselines with minimal loss in language modeling ability.
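The abstract does not spell out what an "intervened" training example looks like. A common form of data intervention in the gender-bias literature is counterfactual substitution of gendered terms, which the following minimal sketch illustrates; the word pairs and sentences are illustrative assumptions, not the paper's actual data or method.

```python
# Hypothetical sketch of a gender data intervention: produce a
# counterfactual copy of each sentence with gendered words swapped.
# A few-shot debiasing set could then pair originals with counterfactuals.

# Illustrative subset of gender word pairs (the real lists used in the
# literature are much longer). Ambiguous tokens such as "her"
# (object vs. possessive) need extra handling and are omitted here.
GENDER_PAIRS = {
    "he": "she",
    "man": "woman",
    "men": "women",
    "father": "mother",
    "son": "daughter",
}
# Make the mapping symmetric so swaps work in both directions.
SWAP = {**GENDER_PAIRS, **{v: k for k, v in GENDER_PAIRS.items()}}


def intervene(sentence: str) -> str:
    """Return a counterfactual copy with gendered words swapped."""
    out = []
    for tok in sentence.split():
        core = tok.strip(".,!?").lower()
        if core in SWAP:
            swapped = SWAP[core]
            # Preserve capitalization and trailing punctuation.
            if tok[0].isupper():
                swapped = swapped.capitalize()
            trailing = tok[len(tok.rstrip(".,!?")):]
            out.append(swapped + trailing)
        else:
            out.append(tok)
    return " ".join(out)


examples = ["He is a doctor.", "The father helped the son."]
intervened = [intervene(s) for s in examples]
# intervened == ["She is a doctor.", "The mother helped the daughter."]
```

Fine-tuning on such a small balanced set (the paper reports only 10 examples) is what makes the approach few-shot and cheap compared to retraining from scratch.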
Jun-7-2023