Distilling Tool Knowledge into Language Models via Back-Translated Traces

Huang, Xingyue; Hu, Xianglong; Ding, Zifeng; He, Yuan; Rishabh; Alzarooni, Waleed; Ye, Ziyu; Fan, Wendong; He, Bailan; Bo, Haige; Hu, Changran; Li, Guohao

Jun-25-2025–arXiv.org Artificial Intelligence

Large language models (LLMs) often struggle with mathematical problems that require exact computation or multi-step algebraic reasoning. Tool-integrated reasoning (TIR) offers a promising solution by leveraging external tools such as code interpreters to ensure correctness, but it introduces inference-time dependencies that hinder scalability and deployment. In this work, we propose a new paradigm for distilling tool knowledge into LLMs purely through natural language. We first construct a Solver Agent that solves math problems by interleaving planning, symbolic tool calls, and reflective reasoning. Then, using a back-translation pipeline powered by multiple LLM-based agents, we convert interleaved TIR traces into natural language reasoning traces. A Translator Agent generates explanations for individual tool calls, while a Rephrase Agent merges them into a fluent and globally coherent narrative. Empirically, we show that fine-tuning a small open-source model on these synthesized traces enables it to internalize both tool knowledge and structured reasoning patterns, yielding gains on competition-level math benchmarks without requiring tool access at inference.
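The back-translation step described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the `Step` record, the function names, and the template-based stand-ins for the Translator Agent and Rephrase Agent are all assumptions made for illustration, and in the paper both agents are LLM-based rather than rule-based. The sketch only shows the data flow: tool calls in an interleaved trace are replaced by natural-language explanations, and the segments are then merged into one narrative that contains no tool code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One step of a hypothetical interleaved TIR trace."""
    kind: str          # "plan", "tool", or "reflect"
    text: str          # natural-language content, or tool code for "tool" steps
    output: str = ""   # tool result; only meaningful for "tool" steps


def translate_tool_call(step: Step) -> str:
    # Stand-in for the Translator Agent (assumption): a fixed template
    # instead of an LLM call. It keeps the result and drops the tool code.
    return f"Working this out by hand gives {step.output}."


def rephrase(segments: List[str]) -> str:
    # Stand-in for the Rephrase Agent (assumption): plain joining instead
    # of an LLM rewrite into a fluent, globally coherent narrative.
    return " ".join(s.strip() for s in segments if s.strip())


def back_translate(
    trace: List[Step],
    translator: Callable[[Step], str] = translate_tool_call,
    rephraser: Callable[[List[str]], str] = rephrase,
) -> str:
    # Replace each tool call with its explanation; keep other steps as-is.
    segments = [translator(s) if s.kind == "tool" else s.text for s in trace]
    return rephraser(segments)


# A toy trace, invented for this example.
trace = [
    Step("plan", "First, factor the quadratic x^2 - 5x + 6."),
    Step("tool", "sympy.factor(x**2 - 5*x + 6)", output="(x - 2)(x - 3)"),
    Step("reflect", "So the roots are x = 2 and x = 3."),
]
narrative = back_translate(trace)
print(narrative)
```

In a realistic setup the two stand-in functions would be swapped for model calls, and the resulting narratives would serve as fine-tuning targets, so the trained model needs no tool access at inference.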

artificial intelligence, large language model, natural language

Country:
- Europe (0.93)
- North America > United States (0.92)

Genre:
- Workflow (0.96)
- Research Report > Promising Solution (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Cognitive Science > Problem Solving (1.00)
