LogiNumSynth: Synthesizing Joint Logical-Numerical Reasoning Problems for Language Models

Liu, Yiwei, Li, Yucheng, Li, Xiao, Cheng, Gong

Oct-14-2025–arXiv.org Artificial Intelligence

Joint logical-numerical reasoning remains a major challenge for language models, yet existing datasets rely on fixed rule sets and offer limited control over task complexity, which limits how well they generalize for evaluation and training. We present LogiNumSynth, a flexible natural language problem synthesizer that generates tasks requiring proficiency in both logical reasoning (e.g., rule-based reasoning) and numerical reasoning (e.g., arithmetic computation). LogiNumSynth supports fine-grained control over reasoning world richness, logical reasoning depth, and the complexity of numerical computations, enabling flexible data synthesis across difficulty levels. We make three key contributions: (1) Synthesizer -- synthesizing fully controllable joint reasoning tasks over natural language; (2) Evaluation & Process Analysis -- evaluating both process accuracy and answer accuracy; (3) Targeted Training -- using synthesized data to enhance LLMs' reasoning performance. Experiments with multiple LLMs highlight persistent weaknesses in logical-numerical reasoning, showing that LogiNumSynth can serve as both a diagnostic tool and a source of targeted supervision for advancing integrated reasoning skills.
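To make the three control knobs in the abstract concrete, the following is a minimal toy sketch, not the LogiNumSynth implementation. Every name here (`Problem`, `synthesize`, `score`, and the parameters `num_entities`, `depth`, `max_operand`) is hypothetical and chosen only for illustration. It assumes that world richness can be approximated by the number of entities, logical depth by the length of a rule chain, and numerical complexity by operand size. It also assumes that process accuracy means matching every intermediate value, which may differ from the paper's definition.

```python
import random
from dataclasses import dataclass


@dataclass
class Problem:
    text: str    # natural-language facts, rules, and question
    steps: list  # intermediate values along the rule chain (the "process")
    answer: int  # final value the question asks for


def synthesize(num_entities: int, depth: int, max_operand: int, seed: int = 0) -> Problem:
    """Hypothetical toy generator: one rule chain of length `depth` plus distractor facts."""
    if not 1 <= depth < num_entities:
        raise ValueError("need 1 <= depth < num_entities")
    rng = random.Random(seed)
    names = [f"E{i}" for i in range(num_entities)]
    rng.shuffle(names)
    # The first depth+1 entities form the reasoning chain; the rest are distractors.
    chain, distractors = names[: depth + 1], names[depth + 1:]

    value = rng.randint(1, max_operand)
    lines = [f"{chain[0]} has {value} points."]
    for d in distractors:
        lines.append(f"{d} has {rng.randint(1, max_operand)} points.")

    steps = []
    for prev, cur in zip(chain, chain[1:]):
        op = rng.choice(["+", "*"])
        k = rng.randint(1, max_operand)
        value = value + k if op == "+" else value * k
        word = "plus" if op == "+" else "times"
        lines.append(f"If {prev} has points, then {cur} has {prev}'s points {word} {k}.")
        steps.append(value)  # record the derived value for process-level checking

    lines.append(f"How many points does {chain[-1]} have?")
    return Problem("\n".join(lines), steps, value)


def score(problem: Problem, predicted_steps, predicted_answer):
    """Return (answer_correct, process_correct) for one model response."""
    answer_ok = predicted_answer == problem.answer
    process_ok = list(predicted_steps) == problem.steps
    return answer_ok, process_ok


if __name__ == "__main__":
    p = synthesize(num_entities=5, depth=3, max_operand=9, seed=1)
    print(p.text)
    print(score(p, p.steps, p.answer))
```

Raising `depth` lengthens the chain a solver must follow, raising `num_entities` adds distractor facts, and raising `max_operand` makes each arithmetic step harder. Scoring the steps separately from the answer separates a correct final number reached by sound reasoning from one reached by a flawed process.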

large language model, machine learning, natural language

Country:
- Asia (0.46)
- Europe > Austria (0.28)

Genre:
- Research Report > New Finding (0.67)

Industry:
- Education (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Rule-Based Reasoning (1.00)
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.96)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
