Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models?
