Blending Learning to Rank and Dense Representations for Efficient and Effective Cascades
Nardini, Franco Maria, Perego, Raffaele, Tonellotto, Nicola, Trani, Salvatore
arXiv.org Artificial Intelligence
We investigate how lexical and neural relevance signals can be combined for ad-hoc passage retrieval. Our study uses a large-scale training dataset in which dense neural representations of MS-MARCO queries and passages are complemented by 253 hand-crafted lexical features extracted from the same corpus. A classical Learning-to-Rank (LTR) model based on a forest of decision trees learns how to blend the relevance signals from the two groups of features. To evaluate our solution, we employ a pipelined architecture in which a dense neural retriever serves as the first stage and performs a nearest-neighbor search over the neural representations of the documents. Our LTR model acts as the second stage, re-ranking the candidates retrieved by the first stage to improve effectiveness. Reproducible experiments conducted with state-of-the-art dense retrievers on publicly available resources show that the proposed solution significantly improves end-to-end ranking performance with only a small impact on efficiency. Specifically, we achieve a boost in nDCG@10 of up to 11% with an increase in average query latency of only 4.3%. This confirms the advantage of combining two distinct families of signals that both contribute to retrieval effectiveness.
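The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the function names are hypothetical, the first stage is an exhaustive inner-product search standing in for a real nearest-neighbor index, the blended feature vector is simply the dense similarity score followed by the lexical features, and the "forest" is a hand-set sum of decision stumps rather than a learned LTR model.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
# A decision stump: (feature_index, threshold, value_if_<=, value_if_>).
Stump = Tuple[int, float, float, float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def first_stage(query_vec: Vector, passage_vecs: Dict[str, Vector],
                k: int) -> List[Tuple[str, float]]:
    """Return the k passages with the highest inner product to the query.

    Exhaustive search; a stand-in (assumption) for a nearest-neighbor index.
    """
    scored = [(pid, dot(query_vec, vec)) for pid, vec in passage_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def forest_score(features: Vector, stumps: Sequence[Stump]) -> float:
    """Additive ensemble: sum the output of every stump."""
    total = 0.0
    for index, threshold, low, high in stumps:
        total += low if features[index] <= threshold else high
    return total


def second_stage(candidates: List[Tuple[str, float]],
                 lexical_features: Dict[str, Vector],
                 stumps: Sequence[Stump]) -> List[Tuple[str, float]]:
    """Re-rank first-stage candidates on blended dense + lexical features."""
    reranked = []
    for pid, dense_score in candidates:
        # Assumed feature layout: [dense score, lexical feature 1, ...].
        features = [dense_score, *lexical_features[pid]]
        reranked.append((pid, forest_score(features, stumps)))
    reranked.sort(key=lambda item: item[1], reverse=True)
    return reranked


def cascade(query_vec: Vector, passage_vecs: Dict[str, Vector],
            lexical_features: Dict[str, Vector], stumps: Sequence[Stump],
            k: int) -> List[Tuple[str, float]]:
    """Dense retrieval followed by tree-ensemble re-ranking."""
    candidates = first_stage(query_vec, passage_vecs, k)
    return second_stage(candidates, lexical_features, stumps)
```

Only the k candidates that survive the first stage are scored by the second stage, so the re-ranking cost depends on k rather than on the size of the collection.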
Oct-21-2025
- Country:
  - Europe
    - Austria (0.04)
    - Italy > Tuscany
      - Pisa Province > Pisa (0.05)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Switzerland (0.04)
  - North America
    - Canada (0.05)
    - United States
      - Minnesota > Hennepin County
        - Minneapolis (0.14)
      - New York > New York County
        - New York City (0.05)
      - Texas > Travis County
        - Austin (0.04)
- Genre:
  - Research Report (0.51)
- Technology: