DEL-ToM: Inference-Time Scaling for Theory-of-Mind Reasoning via Dynamic Epistemic Logic
Yuheng Wu, Jianwen Xie, Denghui Zhang, Zhaozhuo Xu
arXiv.org Artificial Intelligence
Theory-of-Mind (ToM) tasks pose a unique challenge for large language models (LLMs), which often lack the capability for dynamic logical reasoning. In this work, we propose DEL-ToM, a framework that improves verifiable ToM reasoning through inference-time scaling rather than architectural changes. Our approach decomposes ToM tasks into a sequence of belief updates grounded in Dynamic Epistemic Logic (DEL), enabling structured and verifiable dynamic logical reasoning. We use data generated automatically via a DEL simulator to train a verifier, which we call the Process Belief Model (PBM), to score each belief update step. During inference, the PBM evaluates candidate belief traces from the LLM and selects the highest-scoring one. This allows LLMs to allocate extra inference-time compute to yield more transparent reasoning. Experiments across model scales and benchmarks show that DEL-ToM consistently improves performance, demonstrating that verifiable belief supervision significantly enhances LLMs' ToM capabilities without retraining. Code is available at https://github.com/joel-wu/DEL-ToM.
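The selection procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `pbm_score` stands in for the trained Process Belief Model, and the toy heuristic, trace format, and product aggregation are all assumptions made for the sketch.

```python
# Best-of-N trace selection, as described in the abstract: an LLM proposes
# several candidate belief-update traces, a Process Belief Model (PBM)
# scores each update step, and the highest-scoring trace is kept.

def pbm_score(step: str) -> float:
    """Hypothetical per-step verifier score in [0, 1] (stub, not the real PBM)."""
    # Toy heuristic: reward steps that state an explicit belief operator.
    return 1.0 if "believes" in step else 0.2

def trace_score(trace: list[str]) -> float:
    """Aggregate step scores; a product mirrors 'every step must be sound'."""
    score = 1.0
    for step in trace:
        score *= pbm_score(step)
    return score

def select_trace(candidates: list[list[str]]) -> list[str]:
    """Pick the candidate belief trace with the highest aggregate PBM score."""
    return max(candidates, key=trace_score)

candidates = [
    ["Anne moves the ball", "Sally thinks the ball is in the basket"],
    ["Anne moves the ball", "Sally believes the ball is in the basket"],
]
best = select_trace(candidates)
print(best[-1])  # prints the final step of the higher-scoring trace
```

Spending more compute here means sampling more candidate traces before selecting, which is the inference-time scaling knob the framework exposes.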
Sep-30-2025