LLMs Reproduce Stereotypes of Sexual and Gender Minorities
A large body of research has found substantial gender bias in NLP systems. Most of this research takes a binary, essentialist view of gender: limiting its variation to the categories _men_ and _women_, conflating gender with sex, and ignoring different sexual identities. But gender and sexuality exist on a spectrum, so in this paper we study the biases of large language models (LLMs) towards sexual and gender minorities beyond binary categories. Grounding our study in a widely used psychological framework -- the Stereotype Content Model -- we demonstrate that English-language survey questions about social perceptions elicit more negative stereotypes of sexual and gender minorities from LLMs, just as they do from humans. We then extend this framework to a more realistic use case: text generation. Our analysis shows that LLMs generate stereotyped representations of sexual and gender minorities in this setting, raising concerns about their capacity to amplify representational harms in creative writing, a widely promoted use case.
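To make the Stereotype Content Model elicitation concrete, here is a minimal illustrative sketch of how survey-style SCM prompts might be posed to an LLM and scored along the model's two core dimensions, warmth and competence. The prompt wording, group labels, trait adjectives, and the `query_llm` stub are all assumptions for demonstration, not the paper's exact protocol; a real study would swap in an actual LLM client and the authors' own survey items.

```python
"""Illustrative sketch (not the paper's exact protocol): eliciting
Stereotype Content Model (SCM) ratings from an LLM via survey-style
prompts and averaging them into a warmth/competence profile."""

import re
from statistics import mean

# SCM's two core dimensions (Fiske et al., 2002); trait adjectives
# here are illustrative stand-ins for a validated item set.
SCM_TRAITS = {
    "warmth": ["warm", "friendly"],
    "competence": ["competent", "capable"],
}

# Illustrative group labels; the paper studies sexual and gender
# minorities beyond binary categories.
GROUPS = ["nonbinary people", "bisexual people", "transgender people"]

PROMPT = (
    "As viewed by society, how {trait} are {group}? "
    "Answer with a single number from 1 (not at all) to 5 (extremely)."
)


def query_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (API client or local model).
    Returns a canned response so the sketch runs end to end."""
    return "3"


def parse_rating(response: str) -> float | None:
    """Extract the first 1-5 digit from the model's reply, if any."""
    match = re.search(r"[1-5]", response)
    return float(match.group()) if match else None


def scm_profile(group: str) -> dict[str, float]:
    """Average the LLM's ratings over each dimension's trait adjectives."""
    profile = {}
    for dimension, traits in SCM_TRAITS.items():
        ratings = []
        for trait in traits:
            reply = query_llm(PROMPT.format(trait=trait, group=group))
            rating = parse_rating(reply)
            if rating is not None:
                ratings.append(rating)
        profile[dimension] = mean(ratings) if ratings else float("nan")
    return profile


if __name__ == "__main__":
    for group in GROUPS:
        print(group, scm_profile(group))
```

Comparing the resulting warmth/competence profiles across group labels is one simple way to surface the kind of differential social perceptions the paper measures, before moving to the open-ended text-generation setting.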
arXiv.org Artificial Intelligence
Jan-10-2025