Clustering Discourses: Racial Biases in Short Stories about Women Generated by Large Language Models

Bonil, Gustavo, Gondim, João, Santos, Marina dos, Hashiguti, Simone, Maia, Helena, Silva, Nadia, Pedrini, Helio, Avila, Sandra

Sep-4-2025–arXiv.org Artificial Intelligence

This study investigates how large language models, in particular LLaMA 3.2-3B, construct narratives about Black and white women in short stories generated in Portuguese. We applied computational methods to a corpus of 2100 texts to group semantically similar stories, which enabled a targeted selection for qualitative analysis. Three main discursive representations emerge: social overcoming, ancestral mythification, and subjective self-realization. The analysis uncovers how grammatically coherent, seemingly neutral texts materialize a crystallized, colonially structured framing of the female body, reinforcing historical inequalities. The study proposes an integrated approach that combines machine learning techniques with qualitative, manual discourse analysis.
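The abstract does not name the specific computational methods used, so the following is only a minimal sketch of the general idea of grouping similar stories and picking one representative per group for close reading. Everything in it is an assumption made for illustration: bag-of-words cosine similarity stands in for a real semantic embedding, greedy threshold clustering stands in for whatever clustering algorithm the authors used, and the toy stories, the `threshold` value, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Lowercased bag-of-words counts; a crude stand-in for a semantic embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(texts, threshold=0.5):
    """Greedy grouping: a story joins the first cluster whose seed is similar enough."""
    vectors = [vectorize(t) for t in texts]
    clusters = []  # each cluster is a list of indices; the first index is the seed
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters, vectors


def representative(members, vectors):
    """Pick the member with the highest total similarity to its cluster (a medoid)."""
    return max(members, key=lambda i: sum(cosine(vectors[i], vectors[j]) for j in members))


# Hypothetical toy stories, invented for illustration only (not from the study's corpus).
stories = [
    "she worked hard and overcame poverty",
    "she overcame poverty and worked hard every day",
    "she travelled abroad to find herself",
    "she travelled far abroad to find herself again",
]
clusters, vectors = cluster(stories)
picks = [representative(members, vectors) for members in clusters]
```

In a sketch like this, `picks` holds one story index per cluster, which is the kind of shortlist a researcher could then read manually. A realistic pipeline would replace `vectorize` with sentence embeddings and the greedy loop with an established clustering algorithm.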

large language model, machine learning, short story

Country:
- South America > Brazil (0.28)

Genre:
- Research Report > New Finding (0.68)

Industry:
- Law > Civil Rights & Constitutional Law (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.49)