The Bias Amplification Paradox in Text-to-Image Generation

Preethi Seshadri, Sameer Singh, Yanai Elazar

arXiv.org Artificial Intelligence 

Bias amplification is a phenomenon in which models exacerbate biases or stereotypes present in the training data. In this paper, we study bias amplification in the text-to-image domain using Stable Diffusion by comparing gender ratios in training vs. generated images. We find that the model appears to amplify gender-occupation biases found in the training data (LAION) considerably. However, we discover that amplification can be largely attributed to discrepancies between training captions and model prompts. For example, an inherent difference is that captions from the training data often contain explicit gender information while our prompts do not, which leads to a distribution shift and consequently inflates bias measures. Once we account for distributional differences between texts used for training and generation when evaluating amplification, we observe that amplification decreases drastically. Our findings illustrate the challenges of comparing biases in models and their training data, and highlight confounding factors that impact analyses.

Figure 1: Comparing generated and training images for engineer, the model clearly seems to amplify bias by going from 25% to 10% female in training vs. generated images. However, when looking at the subset of training examples without gender indicators in captions (similar ...
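The comparison described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the word list `GENDER_TERMS`, the helper names, and in particular the definition of amplification as the increase in distance from a 50% female ratio. The only numbers reused from the source are the 25% and 10% female ratios for engineer.

```python
import re

# Hypothetical, deliberately small list of explicit gender indicators.
# A real analysis would need a far more careful lexicon.
GENDER_TERMS = {"woman", "women", "man", "men", "female", "male",
                "she", "he", "her", "his"}


def has_gender_indicator(caption: str) -> bool:
    """Return True if the caption contains an explicit gender term."""
    tokens = re.findall(r"[a-z]+", caption.lower())
    return any(tok in GENDER_TERMS for tok in tokens)


def female_ratio(labels) -> float:
    """Fraction of images labeled 'female' in a non-empty collection."""
    labels = list(labels)
    if not labels:
        raise ValueError("need at least one label")
    return sum(1 for lab in labels if lab == "female") / len(labels)


def amplification(train_ratio: float, gen_ratio: float) -> float:
    """Assumed measure: how much further from parity (0.5) the generated
    ratio is than the training ratio. Positive means more skewed."""
    return abs(gen_ratio - 0.5) - abs(train_ratio - 0.5)


def ungendered_subset(examples):
    """Keep only (caption, label) pairs whose caption has no gender term,
    so the training text resembles gender-neutral prompts."""
    return [(c, lab) for c, lab in examples if not has_gender_indicator(c)]


# Ratios for "engineer" quoted in the Figure 1 caption: 25% vs. 10% female.
naive = amplification(train_ratio=0.25, gen_ratio=0.10)

# Toy captions, invented purely to show the filtering step.
toy_training = [
    ("a female engineer at her desk", "female"),
    ("an engineer inspecting a turbine", "male"),
    ("engineer reviewing blueprints", "male"),
    ("portrait of a woman engineer", "female"),
]
subset = ungendered_subset(toy_training)
subset_ratio = female_ratio(lab for _, lab in subset)
```

Under this assumed measure, the naive comparison for engineer gives an amplification of roughly 0.15. Recomputing the training ratio on `ungendered_subset(...)` before calling `amplification` is one way to account for the caption/prompt mismatch the abstract describes; the toy captions only demonstrate the mechanics and say nothing about the actual LAION statistics.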