Multi-task retriever fine-tuning for domain-specific and efficient RAG
Patrice Béchard, Orlando Marquez Ayala
arXiv.org Artificial Intelligence
Retrieval-Augmented Generation (RAG) has become ubiquitous when deploying Large Language Models (LLMs), as it can address typical limitations such as generating hallucinated or outdated information. However, when building real-world RAG applications, practical issues arise. First, the retrieved information is generally domain-specific. Since it is computationally expensive to fine-tune LLMs, it is more feasible to fine-tune the retriever to improve the quality of the data included in the LLM input. Second, as more applications are deployed in the same real-world system, one cannot afford to deploy separate retrievers. Moreover, these RAG applications normally retrieve different kinds of data. Our solution is to instruction fine-tune a small retriever encoder on a variety of domain-specific tasks to allow us to deploy one encoder that can serve many use cases, thereby achieving low cost, scalability, and speed. We show how this encoder generalizes to out-of-domain settings as well as to an unseen retrieval task on real-world enterprise use cases.
Jan-8-2025
- Country:
  - Europe > Monaco (0.04)
  - North America
    - United States
      - New York (0.04)
      - Florida > Miami-Dade County
        - Miami (0.04)
    - Mexico > Mexico City
      - Mexico City (0.04)
    - Canada > Ontario
      - Toronto (0.04)
  - Asia
    - Middle East > Jordan (0.04)
    - China > Hong Kong (0.04)
    - Myanmar > Tanintharyi Region
      - Dawei (0.04)
- Genre:
  - Research Report (0.82)
  - Workflow (0.72)
- Technology: