Anatomy of a Feeling: Narrating Embodied Emotions via Large Vision-Language Models

Saim, Mohammad, Duong, Phan Anh, Luong, Cat, Bhanderi, Aniket, Jiang, Tianyu

Sep-25-2025–arXiv.org Artificial Intelligence

The way emotional reactions are embodied in different body parts carries rich information about our affective experiences. We propose a framework that uses state-of-the-art large vision-language models (LVLMs) to generate Embodied LVLM Emotion Narratives (ELENA). These narratives are well-defined, multi-layered text outputs consisting primarily of descriptions of the salient body parts involved in emotional reactions. Using attention maps, we also observe that contemporary models exhibit a persistent bias towards the facial region. Despite this limitation, our framework effectively recognizes embodied emotions in face-masked images, outperforming baselines without any fine-tuning. ELENA opens a new trajectory for embodied emotion analysis in the vision modality and enriches affect-aware modeling.
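The abstract mentions two operations without giving implementation details: masking the face in an image and checking how much attention falls on the facial region. The following is a minimal sketch of both ideas on toy 2D grids, not the authors' pipeline. The function names `mask_box` and `attention_share`, the box convention, and the nested-list image representation are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box convention: (top, left, bottom, right), with bottom/right exclusive.
Box = Tuple[int, int, int, int]


def mask_box(image: List[List[float]], box: Box, fill: float = 0.0) -> List[List[float]]:
    """Return a copy of `image` with every cell inside `box` replaced by `fill`."""
    top, left, bottom, right = box
    return [
        [fill if top <= r < bottom and left <= c < right else value
         for c, value in enumerate(row)]
        for r, row in enumerate(image)
    ]


def attention_share(attn: List[List[float]], box: Box) -> float:
    """Return the fraction of total attention mass that lies inside `box`."""
    top, left, bottom, right = box
    total = sum(sum(row) for row in attn)
    if total == 0:
        return 0.0  # no attention mass at all, so no share to report
    inside = sum(sum(row[left:right]) for row in attn[top:bottom])
    return inside / total


if __name__ == "__main__":
    toy = [[1.0] * 4 for _ in range(4)]   # uniform 4x4 grid standing in for an image or attention map
    face_box = (0, 1, 2, 3)               # assumed face location in the toy grid
    print(mask_box(toy, face_box))        # face cells are overwritten with 0.0
    print(attention_share(toy, face_box)) # 4 of 16 uniform cells, i.e. 0.25
```

On a real attention map, a share well above the box's fraction of the image area would be one simple indicator of the facial bias the abstract describes.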

large language model, machine learning, natural language

Genre:
- Research Report > New Finding (0.68)

Industry:
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Cognitive Science > Emotion (1.00)
  - Natural Language > Large Language Model (0.72)
  - Machine Learning > Neural Networks
    - Deep Learning (0.71)
