RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs

Yu, Yue; Ping, Wei; Liu, Zihan; Wang, Boxin; You, Jiaxuan; Zhang, Chao; Shoeybi, Mohammad; Catanzaro, Bryan

Jul-2-2024–arXiv.org Artificial Intelligence

Large language models (LLMs) typically use the top-k contexts from a retriever in retrieval-augmented generation (RAG). In this work, we propose RankRAG, a novel instruction fine-tuning framework that instruction-tunes a single LLM for the dual purpose of context ranking and answer generation in RAG. In particular, the instruction-tuned LLMs work surprisingly well when a small fraction of ranking data is added to the training blend, and they outperform existing expert ranking models, including the same LLM exclusively fine-tuned on a large amount of ranking data. For generation, we compare our model with many strong baselines, including GPT-4-0613, GPT-4-turbo-2024-0409, and ChatQA-1.5, an open-source model with state-of-the-art performance on RAG benchmarks. Specifically, our Llama3-RankRAG significantly outperforms Llama3-ChatQA-1.5 and GPT-4 models on nine knowledge-intensive benchmarks. It also performs comparably to GPT-4 on five RAG benchmarks in the biomedical domain without instruction fine-tuning on biomedical data, demonstrating a strong capability to generalize to new domains.
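The abstract does not spell out the inference procedure, so the following is only a minimal sketch of one plausible reading: rerank the retrieved contexts, keep the top k, and generate from those. The names `rank_then_generate`, `overlap_score`, and `echo_generate` are hypothetical and not from the paper. The scorer and generator are toy stand-ins for what would be two prompts to the same instruction-tuned LLM.

```python
from typing import Callable, List, Tuple


def rank_then_generate(
    question: str,
    contexts: List[str],
    score: Callable[[str, str], float],
    generate: Callable[[str, List[str]], str],
    k: int = 2,
) -> Tuple[List[str], str]:
    """Rerank retrieved contexts, keep the top k, then generate an answer.

    Assumption: in the setting the abstract describes, `score` and
    `generate` would both be served by one LLM. Here they are plain
    callables so the sketch runs without a model.
    """
    # sorted() is stable, so ties keep the retriever's original order.
    ranked = sorted(contexts, key=lambda c: score(question, c), reverse=True)
    top_k = ranked[:k]
    return top_k, generate(question, top_k)


# Hypothetical stand-ins for the LLM, for illustration only.
def overlap_score(question: str, context: str) -> float:
    """Toy relevance score: count of lowercase words shared with the question."""
    q_words = set(question.lower().split())
    return float(len(q_words & set(context.lower().split())))


def echo_generate(question: str, contexts: List[str]) -> str:
    """Toy generator: returns the highest-ranked context verbatim."""
    return contexts[0] if contexts else ""
```

Passing the scorer and generator as arguments keeps the control flow visible while leaving open how a real model would implement each role.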
