Text Is Not All You Need: Multimodal Prompting Helps LLMs Understand Humor
– arXiv.org Artificial Intelligence
While Large Language Models (LLMs) have demonstrated impressive natural language understanding capabilities across various text-based tasks, understanding humor has remained a persistent challenge. Humor is frequently multimodal, relying on phonetic ambiguity, rhythm, and timing to convey meaning. In this study, we explore a simple multimodal prompting approach to humor understanding and explanation. We present an LLM with both the text and the spoken form of a joke, the latter generated with an off-the-shelf text-to-speech (TTS) system. Using multimodal cues improves humor explanations compared to text-only prompts across all tested datasets.
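The abstract only describes the pipeline at a high level. A minimal sketch of the idea, assuming gTTS as the off-the-shelf TTS system and an OpenAI audio-capable chat model, might look like the following; the model name, prompt wording, and library choices are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of the text + TTS-audio prompting idea from the abstract.
# gTTS, the "gpt-4o-audio-preview" model, and the prompt text are assumptions,
# not the authors' exact setup.
import base64

from gtts import gTTS          # off-the-shelf text-to-speech
from openai import OpenAI      # audio-capable chat completions

client = OpenAI()


def explain_joke(joke: str) -> str:
    # 1. Render the joke as speech so phonetic cues (ambiguity, rhythm, timing)
    #    are available to the model as an audio signal.
    gTTS(text=joke).save("joke.mp3")
    with open("joke.mp3", "rb") as f:
        audio_b64 = base64.b64encode(f.read()).decode("utf-8")

    # 2. Prompt the LLM with both the written joke and its spoken form.
    response = client.chat.completions.create(
        model="gpt-4o-audio-preview",   # assumed audio-capable model
        modalities=["text"],
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Explain why this joke is funny:\n{joke}"},
                {"type": "input_audio",
                 "input_audio": {"data": audio_b64, "format": "mp3"}},
            ],
        }],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(explain_joke("I used to be a banker, but I lost interest."))
```

A text-only baseline would drop the `input_audio` part of the message; the study's comparison is between that baseline and the combined text-plus-audio prompt.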
Dec-1-2024
- Country:
  - Asia
    - Middle East > UAE (0.14)
    - Thailand (0.14)
- Genre:
  - Research Report > New Finding (0.88)
- Industry:
  - Leisure & Entertainment (0.70)
- Technology: