CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation

Cheng, Yiruo, Mao, Kelong, Zhao, Ziliang, Dong, Guanting, Qian, Hongjin, Wu, Yongkang, Sakai, Tetsuya, Wen, Ji-Rong, Dou, Zhicheng

Oct-30-2024, arXiv.org Artificial Intelligence

Retrieval-Augmented Generation (RAG) has become a powerful paradigm for enhancing large language models (LLMs) through external knowledge retrieval. Although RAG has attracted widespread attention, existing academic research predominantly focuses on single-turn RAG, leaving a significant gap in addressing the complexities of multi-turn conversations found in real-world applications. To bridge this gap, we introduce CORAL, a large-scale benchmark designed to assess RAG systems in realistic multi-turn conversational settings. CORAL includes diverse information-seeking conversations automatically derived from Wikipedia and tackles key challenges such as open-domain coverage, knowledge intensity, free-form responses, and topic shifts. It supports three core tasks of conversational RAG: passage retrieval, response generation, and citation labeling. We propose a unified framework to standardize various conversational RAG methods and conduct a comprehensive evaluation of these methods on CORAL, demonstrating substantial opportunities for improving existing approaches.
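
To make the task setup concrete, the following is a minimal sketch of how a multi-turn conversation might feed passage retrieval and how citation labels might be scored. It is an illustration only, not CORAL's data format or evaluation code: the `Turn` schema, the history-concatenation query, the word-overlap retriever, and the set-based precision/recall are all assumptions made for this example. Response generation is omitted because it requires an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """One conversation turn (hypothetical schema, not CORAL's format)."""
    question: str
    response: str = ""
    cited_ids: list[str] = field(default_factory=list)


def build_query(history: list[Turn], question: str) -> str:
    """Naive context handling: prepend earlier questions to the current one."""
    return " ".join([t.question for t in history] + [question])


def retrieve(query: str, passages: dict[str, str], k: int = 2) -> list[str]:
    """Toy passage retrieval: rank passages by word overlap with the query.

    Ties are broken by passage id so the ranking is deterministic.
    """
    q_terms = set(query.lower().split())
    ranked = sorted(
        passages,
        key=lambda pid: (-len(q_terms & set(passages[pid].lower().split())), pid),
    )
    return ranked[:k]


def citation_precision_recall(
    predicted: list[str], gold: list[str]
) -> tuple[float, float]:
    """Set-based precision and recall of cited passage ids (assumed metric)."""
    pred, ref = set(predicted), set(gold)
    hits = len(pred & ref)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    return precision, recall


# Made-up passages and conversation, for illustration only.
passages = {
    "p1": "coral reefs are built by colonies of coral polyps",
    "p2": "the great barrier reef is located off the coast of australia",
    "p3": "python is a programming language",
}
history = [Turn("what are coral reefs")]
query = build_query(history, "where is the largest one located")
top = retrieve(query, passages, k=2)
precision, recall = citation_precision_recall(top, gold=["p2"])
```

The second question ("where is the largest one located") is unanswerable on its own, which is why the sketch folds the history into the query; stronger systems would typically rewrite the query or encode the history with a learned model instead.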

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)