Toward Mechanistic Explanation of Deductive Reasoning in Language Models

Davide Maltoni, Matteo Ferrara

arXiv.org, Artificial Intelligence

Recent large language models have demonstrated notable capabilities in solving problems that require logical reasoning; however, the corresponding internal mechanisms remain largely unexplored. In this paper, we show that a small language model can solve a deductive reasoning task by learning the underlying rules (rather than operating as a statistical learner). A low-level explanation of its internal representations and computational circuits is then provided. Our findings reveal that induction heads play a central role in implementing the rule completion and rule chaining steps involved in the logical inference required by the task.

Introduction

Recent Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning and problem-solving (Huang and Chang, 2023). Many approaches have focused on enhancing logical reasoning in LLMs, with a growing body of work introducing formal and symbolic logic-based benchmarks (Liu et al., 2025). While much of the literature emphasizes solving reasoning benchmarks, comparatively less attention has been devoted to understanding and explaining the underlying low-level computational mechanisms. Yet, interpretability is crucial for designing more robust and targeted models that are less prone to errors.
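The abstract attributes rule completion and rule chaining to induction heads. As a concrete illustration of what such a head does, the sketch below computes the standard prefix-matching score commonly used in the interpretability literature to detect induction heads: for a query token that has occurred earlier in the sequence, an induction head places attention on the token that followed that earlier occurrence. The attention matrix here is synthetic and the function name is ours; this is a minimal illustration of the mechanism, not the paper's measurement code.

```python
import numpy as np

def induction_score(tokens, attn):
    """Prefix-matching (induction) score for one attention head.

    For each query position q whose token also occurred earlier at some
    position p, an induction head attends from q to p + 1, i.e. to the
    token that followed the previous occurrence. The score is the mean
    attention mass placed on those p + 1 positions.
    """
    scores = []
    for q, tok in enumerate(tokens):
        # earlier occurrences of the same token (causal: p + 1 < q)
        prev = [p for p in range(q) if tokens[p] == tok and p + 1 < q]
        if prev:
            scores.append(sum(attn[q, p + 1] for p in prev))
    return float(np.mean(scores)) if scores else 0.0

# Toy sequence "A B C A": at the second "A", an induction head should
# attend to the "B" that followed the first "A".
tokens = ["A", "B", "C", "A"]
attn = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.3, 0.3, 0.4, 0.0],
    [0.1, 0.8, 0.1, 0.0],  # query "A" puts most mass on "B"
])
print(induction_score(tokens, attn))  # 0.8
```

In a rule-chaining prompt such as "A -> B, B -> C, A ?", this copy-from-after-previous-occurrence pattern is plausibly what lets a head complete a rule (retrieve B given A) and, composed across layers, chain rules (retrieve C given B), matching the role the paper assigns to induction heads.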
