MMSD2.0: Towards a Reliable Multi-modal Sarcasm Detection System

Qin, Libo, Huang, Shijue, Chen, Qiguang, Cai, Chenran, Zhang, Yudi, Liang, Bin, Che, Wanxiang, Xu, Ruifeng

Jul-13-2023–arXiv.org Artificial Intelligence

Multi-modal sarcasm detection has attracted much recent attention. Nevertheless, the existing benchmark (MMSD) has shortcomings that hinder the development of reliable multi-modal sarcasm detection systems: (1) MMSD contains spurious cues, which lead models to learn biased patterns; (2) the negative samples in MMSD are not always reasonable. To address these issues, we introduce MMSD2.0, a corrected dataset that fixes the shortcomings of MMSD by removing the spurious cues and re-annotating the unreasonable samples. We also present a novel framework called multi-view CLIP that leverages multi-grained cues from multiple perspectives (i.e., text, image, and text-image interaction views) for multi-modal sarcasm detection. Extensive experiments show that MMSD2.0 is a valuable benchmark for building reliable multi-modal sarcasm detection systems and that multi-view CLIP significantly outperforms the previous best baselines.
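
The abstract names three views (text, image, and text-image interaction) but does not say how their predictions are combined. As a rough illustration only, the sketch below assumes a simple late fusion in which each view produces per-class logits that are summed before a softmax. The function names, the logit values, and the fusion rule itself are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(text_logits, image_logits, interaction_logits):
    """Assumed late fusion: sum per-class logits from the three views,
    then normalize. This is an illustrative choice, not the paper's
    confirmed method."""
    fused = [
        t + i + x
        for t, i, x in zip(text_logits, image_logits, interaction_logits)
    ]
    return softmax(fused)


# Hypothetical logits over the classes [not sarcastic, sarcastic].
probs = fuse_views(
    text_logits=[0.2, 0.1],          # text view is nearly undecided
    image_logits=[0.0, 0.3],         # image view leans slightly sarcastic
    interaction_logits=[-0.5, 1.5],  # interaction view is confident
)
label = "sarcastic" if probs[1] > probs[0] else "not sarcastic"
```

In this toy case the interaction view dominates the sum, which matches the intuition that sarcasm often shows up as a mismatch between text and image rather than in either modality alone.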

detection, machine learning, natural language
