Toward Mechanistic Explanation of Deductive Reasoning in Language Models

Davide Maltoni, Matteo Ferrara

arXiv.org, Artificial Intelligence

Recent large language models have demonstrated notable capabilities in solving problems that require logical reasoning; however, the corresponding internal mechanisms remain largely unexplored. In this paper, we show that a small language model can solve a deductive reasoning task by learning the underlying rules (rather than operating as a statistical learner). A low-level explanation of its internal representations and computational circuits is then provided. Our findings reveal that induction heads play a central role in implementing the rule completion and rule chaining steps involved in the logical inference required by the task.

Introduction

Recent Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning and problem-solving (Huang and Chang, 2023). Many approaches have focused on enhancing logical reasoning in LLMs, with a growing body of work introducing formal and symbolic logic-based benchmarks (Liu et al., 2025). While much of the literature emphasizes solving reasoning benchmarks, comparatively less attention has been devoted to understanding and explaining the underlying low-level computational mechanisms. Yet, interpretability is crucial for designing more robust and targeted models that are less prone to errors.
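The abstract attributes rule completion and rule chaining to induction heads. As a concrete illustration of what such a head does, the sketch below computes the standard prefix-matching score commonly used in the interpretability literature to detect induction heads: for a query token that has occurred earlier in the sequence, an induction head places attention on the token that followed that earlier occurrence. The attention matrix here is synthetic and the function name is ours; this is a minimal illustration of the mechanism, not the paper's measurement code.

```python
import numpy as np

def induction_score(tokens, attn):
    """Prefix-matching (induction) score for one attention head.

    For each query position q whose token also occurred earlier at some
    position p, an induction head attends from q to p + 1, i.e. to the
    token that followed the previous occurrence. The score is the mean
    attention mass placed on those p + 1 positions.
    """
    scores = []
    for q, tok in enumerate(tokens):
        # earlier occurrences of the same token (causal: p + 1 < q)
        prev = [p for p in range(q) if tokens[p] == tok and p + 1 < q]
        if prev:
            scores.append(sum(attn[q, p + 1] for p in prev))
    return float(np.mean(scores)) if scores else 0.0

# Toy sequence "A B C A": at the second "A", an induction head should
# attend to the "B" that followed the first "A".
tokens = ["A", "B", "C", "A"]
attn = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.3, 0.3, 0.4, 0.0],
    [0.1, 0.8, 0.1, 0.0],  # query "A" puts most mass on "B"
])
print(induction_score(tokens, attn))  # 0.8
```

In a rule-chaining prompt such as "A -> B, B -> C, A ?", this copy-from-after-previous-occurrence pattern is plausibly what lets a head complete a rule (retrieve B given A) and, composed across layers, chain rules (retrieve C given B), matching the role the paper assigns to induction heads.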
