ORION: Teaching Language Models to Reason Efficiently in the Language of Thought

Tanmay, Kumar; Aggarwal, Kriti; Liang, Paul Pu; Mukherjee, Subhabrata

arXiv.org Artificial Intelligence 

Large Reasoning Models (LRMs) achieve state-of-the-art performance in mathematics, code generation, and task planning. Inspired by the Language of Thought Hypothesis, which posits that human reasoning operates over a symbolic, compositional mental language called Mentalese, we introduce a cognitively motivated framework that trains models to reason in a similarly compact style. Mentalese encodes abstract reasoning as ultra-compressed, structured tokens, enabling models to solve complex problems in far fewer steps. When applied to Mentalese-aligned models, SLPO achieves much larger compression rates by enabling compressed reasoning that preserves the benefits of detailed thinking without the computational overhead, allowing us to present the best-performing models at each compression level along the performance-efficiency Pareto frontier. Across mathematical benchmarks (including AIME 2024 & 2025, Minerva-Math, OlympiadBench, Math500, and AMC), our ORION models generate reasoning traces with 4-16× fewer tokens, achieve up to 5× lower inference latency, and reduce training costs by 7-9× relative to the base DeepSeek R1 Distilled model, while maintaining 90-98% of the baseline accuracy. ORION models also surpass Claude and ChatGPT-4o by up to 5% in accuracy while maintaining 2× compression. Our findings demonstrate that Mentalese-style compressed reasoning offers a breakthrough toward human-like cognitive efficiency, opening new possibilities for real-time, cost-effective reasoning without sacrificing accuracy.

Figure caption: The dotted curve indicates the Pareto frontier, which illustrates the trade-off between higher compression rates and loss in accuracy. Our proposed method, combining Mentalese alignment with SLPO, consistently lies on this frontier, identifying an operating point that balances accuracy and efficiency.

Work done during internship at Hippocratic AI.
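The performance-efficiency Pareto frontier mentioned above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `pareto_frontier`, the (compression rate, accuracy) representation, and every number in `hypothetical_models` are assumptions made for illustration and are not results from the paper. A model is kept on the frontier if no other model is at least as good on both axes.

```python
from typing import List, Tuple

# (compression_rate, accuracy); higher is better on both axes.
# Compression rate is assumed here to mean baseline tokens / model tokens.
Point = Tuple[float, float]


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the points that no other point matches or beats on both axes."""
    # Visit points from highest to lowest compression; among equal
    # compression, visit the most accurate first.
    ordered = sorted(points, key=lambda p: (-p[0], -p[1]))
    frontier: List[Point] = []
    best_accuracy = float("-inf")
    for compression, accuracy in ordered:
        # A point survives only if it is strictly more accurate than every
        # point that compresses at least as much.
        if accuracy > best_accuracy:
            frontier.append((compression, accuracy))
            best_accuracy = accuracy
    return sorted(frontier)


# Hypothetical, made-up operating points (NOT numbers from the paper).
hypothetical_models = [
    (1.0, 0.95),
    (2.0, 0.93),
    (2.0, 0.90),   # dominated: same compression, lower accuracy
    (4.0, 0.91),
    (6.0, 0.80),   # dominated by (8.0, 0.85)
    (8.0, 0.85),
    (16.0, 0.70),
]

frontier = pareto_frontier(hypothetical_models)
print(frontier)
```

Sorting once and sweeping keeps the sketch at O(n log n); for a handful of models a pairwise dominance check would work equally well.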
Recent advances such as OpenAI o1 (OpenAI et al., 2024b) and DeepSeek R1 (DeepSeek-AI et al., 2025) have reshaped how we think about language model reasoning. By letting models "think before they answer," these systems dramatically improved credibility and performance, achievements once thought impossible for LLMs (Wu et al., 2024). Explicit reasoning has thus emerged as a central focus of LLM research (Xu et al., 2025).
