CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning
He, Jie; Bai, Richard He; Williamson, Sinead; Pan, Jeff Z.; Jaitly, Navdeep; Zhang, Yizhe
arXiv.org Artificial Intelligence
Retrieval-augmented generation (RAG) enhances large language models (LLMs) with external knowledge but still suffers from long contexts and disjoint retrieval-generation optimization. In this work, we propose CLaRa (Continuous Latent Reasoning), a unified framework that performs embedding-based compression and joint optimization in a shared continuous space. To obtain semantically rich and retrievable compressed vectors, we introduce SCP, a key-preserving data synthesis framework using QA and paraphrase supervision. CLaRa then trains the reranker and generator end-to-end via a single language modeling loss, with gradients flowing through both modules using a differentiable top-k estimator. Theoretically, this unified optimization aligns retrieval relevance with answer quality. Experiments across multiple QA benchmarks show that CLaRa achieves state-of-the-art compression and reranking performance, often surpassing text-based fine-tuned baselines.
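The abstract does not say which differentiable top-k estimator CLaRa uses, so the following is only a minimal sketch of one common relaxation (successive softmax selection), not the authors' method. The function names `softmax` and `relaxed_top_k`, the `temperature` parameter, and the example scores are all illustrative assumptions. The sketch shows how a hard top-k choice over reranker scores can be paired with a smooth surrogate whose weights sum to k, which is what allows a language modeling loss to send gradients back to the reranker.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def relaxed_top_k(scores, k, temperature=1.0):
    """Return (hard_mask, soft_mask) for choosing k of the scored items.

    hard_mask: exact 0/1 indicator of the k highest scores.
    soft_mask: sum of k successive softmax distributions, where items that
               already received probability mass are down-weighted before
               the next step. Its entries sum to k and vary smoothly with
               the scores. This is an assumed, generic relaxation.
    """
    n = len(scores)
    logits = list(scores)
    soft_mask = [0.0] * n
    for _ in range(k):
        p = softmax(logits, temperature)
        soft_mask = [s + pi for s, pi in zip(soft_mask, p)]
        # Suppress items in proportion to how strongly they were just picked.
        logits = [l + math.log(max(1.0 - pi, 1e-12)) for l, pi in zip(logits, p)]

    top = set(sorted(range(n), key=lambda i: scores[i], reverse=True)[:k])
    hard_mask = [1.0 if i in top else 0.0 for i in range(n)]
    return hard_mask, soft_mask


# Hypothetical reranker scores for four compressed documents.
scores = [0.1, 2.0, 1.5, -0.3]
hard, soft = relaxed_top_k(scores, k=2, temperature=0.05)
```

In an autodiff framework, a straight-through variant would typically use `hard_mask` in the forward pass and the gradient of `soft_mask` in the backward pass; lowering `temperature` moves the soft weights closer to the hard selection at the cost of sharper gradients.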
Nov-27-2025
- Country:
- Asia
- Europe
- Austria > Vienna (0.14)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Italy > Calabria
- Catanzaro Province > Catanzaro (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- North America
- Mexico > Mexico City
- Mexico City (0.04)
- United States
- Florida > Miami-Dade County
- Miami (0.04)
- Kentucky (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Minnesota (0.04)
- New York > New York County
- New York City (0.04)
- North Carolina (0.04)
- Oklahoma (0.04)
- Rhode Island > Newport County (0.04)
- Genre:
- Research Report > New Finding (0.92)
- Industry:
- Leisure & Entertainment > Sports > Football (1.00)
- Technology: