Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities
Alex Wilf, Sihyun Shawn Lee, Paul Pu Liang, Louis-Philippe Morency
– arXiv.org Artificial Intelligence
Human interactions are deeply rooted in the interplay of thoughts, beliefs, and desires made possible by Theory of Mind (ToM): our cognitive ability to understand the mental states of ourselves and others. Although ToM may come naturally to us, emulating it presents a challenge to even the most advanced Large Language Models (LLMs). Recent improvements to LLMs' reasoning capabilities from simple yet effective prompting techniques such as Chain-of-Thought have seen limited applicability to ToM. In this paper, we turn to the prominent cognitive science theory "Simulation Theory" to bridge this gap. We introduce SimToM, a novel two-stage prompting framework inspired by Simulation Theory's notion of perspective-taking. To implement this idea on current ToM benchmarks, SimToM first filters context based on what the character in question knows before answering a question about their mental state. Our approach, which requires no additional training and minimal prompt-tuning, shows substantial improvement over existing methods, and our analysis reveals the importance of perspective-taking to Theory-of-Mind capabilities. Our findings suggest perspective-taking as a promising direction for future research into improving LLMs' ToM capabilities.
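The abstract describes SimToM as two sequential prompts: a first prompt rewrites the story keeping only the events the character in question has witnessed (perspective-taking), and a second prompt answers the Theory-of-Mind question against that filtered story. The sketch below is a minimal illustration of that two-stage idea, assuming an OpenAI-compatible chat client; the `chat` helper, model name, and prompt wording are placeholders for illustration, not the paper's exact prompts.

```python
# Minimal sketch of a SimToM-style two-stage prompt.
# Assumes an OpenAI-compatible chat API and OPENAI_API_KEY in the environment;
# prompt wording and model name are illustrative, not the paper's exact setup.
from openai import OpenAI

client = OpenAI()

def chat(prompt: str) -> str:
    """Send a single-turn prompt and return the model's reply text."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def simtom_answer(story: str, character: str, question: str) -> str:
    # Stage 1: perspective-taking -- keep only the events the character knows about.
    filtered_story = chat(
        f"The following is a sequence of events:\n{story}\n\n"
        f"Which of these events does {character} know about? "
        f"Rewrite the story, keeping only those events."
    )
    # Stage 2: answer the ToM question from the character's filtered perspective.
    return chat(
        f"{filtered_story}\n\n"
        f"Answer the following question from {character}'s perspective:\n{question}"
    )
```

For a Sally-Anne-style story, for example, the first stage would drop the event of the object being moved while Sally is out of the room, so the second stage answers with her outdated belief about its location.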
Nov-16-2023
- Country:
  - Europe > United Kingdom > England (0.14)
  - North America > United States (0.46)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Health & Medicine > Therapeutic Area > Neurology (0.88)
- Technology: