Representation Decomposition for Learning Similarity and Contrastness Across Modalities for Affective Computing
Yuanhe Tian, Pengsen Cheng, Guoqing Jin, Lei Zhang, Yan Song
–arXiv.org Artificial Intelligence
Multi-modal affective computing aims to automatically recognize and interpret human attitudes from diverse data sources such as images and text, thereby enhancing human-computer interaction and emotion understanding. Existing approaches typically rely on unimodal analysis or straightforward fusion of cross-modal information, which fails to capture the complex and conflicting evidence presented across different modalities. In this paper, we propose a novel LLM-based approach for affective computing that explicitly decomposes visual and textual representations into shared (modality-invariant) and modality-specific components. Specifically, our approach first encodes and aligns the input modalities using pre-trained multi-modal encoders, then employs a representation decomposition framework to separate common emotional content from modality-unique cues, and finally integrates these decomposed signals via an attention mechanism to form a dynamic soft prompt for a multi-modal LLM. Extensive experiments on three representative affective computing tasks, namely multi-modal aspect-based sentiment analysis, multi-modal emotion analysis, and hateful meme detection, demonstrate the effectiveness of our approach, which consistently outperforms strong baselines and state-of-the-art models.
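The pipeline the abstract describes (decompose each modality into shared and specific parts, then fuse the parts via attention into a soft prompt) can be illustrated with a minimal PyTorch sketch. All module names, dimensions, and architectural details below are assumptions for illustration only; the paper's actual design, losses, and encoders may differ.

```python
# Minimal sketch of representation decomposition + attention-based
# soft-prompt fusion, as described in the abstract. Hypothetical
# module names and dimensions; not the authors' implementation.
import torch
import torch.nn as nn


class RepresentationDecomposer(nn.Module):
    """Splits one modality's features into a shared (modality-invariant)
    part and a modality-specific part via separate projections."""
    def __init__(self, dim: int):
        super().__init__()
        self.shared_proj = nn.Linear(dim, dim)    # common emotional content
        self.specific_proj = nn.Linear(dim, dim)  # modality-unique cues

    def forward(self, x: torch.Tensor):
        return self.shared_proj(x), self.specific_proj(x)


class SoftPromptFusion(nn.Module):
    """Attends learnable prompt queries over the decomposed components
    to produce a fixed-length soft prompt for a multi-modal LLM."""
    def __init__(self, dim: int, prompt_len: int = 8, heads: int = 4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(prompt_len, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, components: torch.Tensor):
        # components: (batch, num_components, dim), e.g. the shared and
        # specific vectors of both modalities stacked along dim 1.
        b = components.size(0)
        q = self.queries.unsqueeze(0).expand(b, -1, -1)
        prompt, _ = self.attn(q, components, components)
        return prompt  # (batch, prompt_len, dim) soft-prompt embeddings


# Toy usage: random features stand in for pre-trained encoder outputs.
dim = 768
img_feat = torch.randn(2, dim)  # pooled visual features
txt_feat = torch.randn(2, dim)  # pooled textual features

decomp = RepresentationDecomposer(dim)
img_shared, img_specific = decomp(img_feat)
txt_shared, txt_specific = decomp(txt_feat)

parts = torch.stack([img_shared, img_specific,
                     txt_shared, txt_specific], dim=1)
soft_prompt = SoftPromptFusion(dim)(parts)
print(soft_prompt.shape)  # torch.Size([2, 8, 768])
```

The resulting soft-prompt embeddings would be prepended to the LLM's input embeddings; how the shared/specific split is supervised (e.g., similarity and orthogonality constraints) is not specified in the abstract and is omitted here.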
Jun-10-2025