Retrieval-augmented reasoning with lean language models
Chan, Ryan Sze-Yin, Nanni, Federico, Lazauskas, Tomas, Wood, Rosie, Yong, Penelope, Tarassenko, Lionel, Girolami, Mark, Geddes, James, Duncan, Andrew
–arXiv.org Artificial Intelligence
This technical report details a novel approach to combining reasoning and retrieval augmented generation (RAG) within a single, lean language model architecture. While existing RAG systems typically rely on large-scale models and external APIs, our work addresses the increasing demand for performant and privacy-preserving solutions deployable in resource-constrained or secure environments. Building on recent developments in test-time scaling and small-scale reasoning models, we develop a retrieval augmented conversational agent capable of interpreting complex, domain-specific queries using a lightweight backbone model. Our system integrates a dense retriever with fine-tuned Qwen2.5-Instruct models, using synthetic query generation and reasoning traces derived from frontier models (e.g., DeepSeek-R1) over a curated corpus, in this case, the NHS A-to-Z condition pages. We explore the impact of summarisation-based document compression, synthetic data design, and reasoning-aware fine-tuning on model performance. Evaluation against both non-reasoning and general-purpose lean models demonstrates that our domain-specific fine-tuning approach yields substantial gains in answer accuracy and consistency, approaching frontier-level performance while remaining feasible for local deployment. All implementation details and code are publicly released to support reproducibility and adaptation across domains.
arXiv.org Artificial Intelligence
Aug-18-2025
- Country:
- Asia
- China (0.04)
- Middle East > Jordan (0.04)
- Myanmar > Tanintharyi Region
- Dawei (0.04)
- Europe > United Kingdom
- England
- Cambridgeshire > Cambridge (0.04)
- Oxfordshire > Oxford (0.04)
- England
- North America > United States
- New Mexico > Bernalillo County
- Albuquerque (0.04)
- New York > New York County
- New York City (0.04)
- New Mexico > Bernalillo County
- Asia
- Genre:
- Overview (1.00)
- Research Report (0.84)
- Industry:
- Technology: