CRAG - Comprehensive RAG Benchmark
Xiao Yang
Neural Information Processing Systems
Retrieval-Augmented Generation (RAG) has recently emerged as a promising way to compensate for the knowledge gaps of Large Language Models (LLMs). Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks. To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering benchmark of 4,409 question-answer pairs and mock APIs that simulate web and Knowledge Graph (KG) search. CRAG is designed to cover a diverse array of questions across five domains and eight question categories, reflecting varied entity popularity, from popular to long-tail, and temporal dynamism, ranging from years to seconds. Our evaluation on this benchmark highlights the gap that remains before QA is fully trustworthy.
May-28-2025, 12:56:26 GMT