Reading Smiles: Proxy Bias in Foundation Models for Facial Emotion Recognition
Tsangko, Iosif; Triantafyllopoulos, Andreas; Abdelmoula, Adem; Mallol-Ragolta, Adria; Schuller, Bjoern W.
arXiv.org Artificial Intelligence
Foundation Models (FMs) are rapidly transforming Affective Computing (AC), with Vision-Language Models (VLMs) now capable of recognising emotions in zero-shot settings. This paper probes a critical but underexplored question: what visual cues do these models rely on to infer affect, and are these cues psychologically grounded or superficially learnt? We benchmark VLMs of varying scale on a teeth-annotated subset of the AffectNet dataset and find consistent performance shifts depending on the presence of visible teeth. Through structured introspection of the best-performing model, i.e., GPT-4o, we show that facial attributes like eyebrow position drive much of its affective reasoning, revealing a high degree of internal consistency in its valence-arousal predictions. These patterns highlight the emergent nature of FM behaviour, but also reveal risks: shortcut learning, bias, and fairness issues, especially in sensitive domains like mental health and education. Understanding and interpreting human emotions is fundamental to social interaction. From early developmental cues in infants to high-stakes decision-making in adults, facial expressions serve as a primary channel for conveying affect.
Dec-1-2025
- Country:
- Asia > China
- Europe
- France > Île-de-France
- Germany > Bavaria
- Upper Bavaria > Munich (0.05)
- United Kingdom > England
- Greater London > London (0.04)
- North America > United States (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Health & Medicine > Therapeutic Area
- Neurology (0.46)
- Psychiatry/Psychology (0.48)
- Technology:
- Information Technology > Artificial Intelligence
- Cognitive Science > Emotion (1.00)
- Machine Learning > Neural Networks
- Deep Learning (1.00)
- Natural Language > Large Language Model (1.00)
- Vision > Face Recognition (1.00)