crag
- North America > United States (0.14)
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Leisure & Entertainment (1.00)
- Media > Music (0.93)
Supplemental Materials
We bear all responsibility in case of violation of rights, etc., and confirm the data license. This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license, which permits sharing and adapting the work provided it is not used for commercial purposes and appropriate credit is given. Please refer to Section 3 for our hosting plan. In this section, we use the framework of Datasheets for Datasets [? ] to form a datasheet for CRAG. For what purpose was the dataset created? Was there a specific task in mind?
- Information Technology (0.90)
- Law (0.70)
CRAG -- Comprehensive RAG Benchmark
Yang, Xiao, Sun, Kai, Xin, Hao, Sun, Yushi, Bhalla, Nikita, Chen, Xiangsen, Choudhary, Sajal, Gui, Rongze Daniel, Jiang, Ziran Will, Jiang, Ziyu, Kong, Lingkun, Moran, Brian, Wang, Jiaqi, Xu, Yifan Ethan, Yan, An, Yang, Chenyu, Yuan, Eting, Zha, Hanwen, Tang, Nan, Chen, Lei, Scheffer, Nicolas, Liu, Yue, Shah, Nirav, Wanga, Rakesh, Kumar, Anuj, Yih, Wen-tau, Dong, Xin Luna
Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate Large Language Models' (LLMs) lack of knowledge. Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks. To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search. CRAG is designed to encapsulate a diverse array of questions across five domains and eight question categories, reflecting varied entity popularity from popular to long-tail, and temporal dynamism ranging from years to seconds. Our evaluation on this benchmark highlights the gap to fully trustworthy QA. Whereas most advanced LLMs achieve no more than 34% accuracy on CRAG, adding RAG in a straightforward manner improves accuracy only to 44%. State-of-the-art industry RAG solutions answer only 63% of questions without any hallucination. CRAG also reveals much lower accuracy in answering questions regarding facts with higher dynamism, lower popularity, or higher complexity, suggesting future research directions. The CRAG benchmark laid the groundwork for a KDD Cup 2024 challenge, attracting thousands of participants and submissions within the first 50 days of the competition. We commit to maintaining CRAG to serve research communities in advancing RAG solutions and general QA solutions.
- Leisure & Entertainment (1.00)
- Media > Music (0.68)
- Media > Film (0.68)
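The abstract above rewards systems for abstaining rather than hallucinating. A minimal sketch of such a truthfulness metric, assuming (as the abstract suggests but does not spell out) that each answer is graded correct, missing, or hallucinated, with hallucinations penalized:

```python
# Hypothetical CRAG-style truthfulness score: +1 for a correct answer,
# 0 for a "missing" answer (the system declines to answer), and -1 for a
# hallucination, so guessing wrongly scores worse than abstaining.
def crag_score(grades):
    points = {"correct": 1.0, "missing": 0.0, "hallucination": -1.0}
    return sum(points[g] for g in grades) / len(grades)

grades = ["correct", "correct", "missing", "hallucination"]
print(crag_score(grades))  # 0.25
```

Under this scheme a system that hallucinates on hard questions can score below one that simply says "I don't know," which is the behavior the benchmark's hallucination-free accuracy figures measure.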
Clustered Retrieved Augmented Generation (CRAG)
Akesson, Simon, Santos, Frances A.
Providing external knowledge to Large Language Models (LLMs) is key to using these models in real-world applications, for several reasons: incorporating up-to-date content in real time, providing access to domain-specific knowledge, and helping prevent hallucination. The vector database-based Retrieval Augmented Generation (RAG) approach has been widely adopted to this end, so that any part of the external knowledge can be retrieved and provided to an LLM as input context. Despite the RAG approach's success, it can still be unfeasible for some applications, because the retrieved context may demand a longer context window than the LLM supports. Even when the retrieved context fits into the context window, the number of tokens may be substantial and, consequently, impact cost and processing time, becoming impractical for most applications. To address these issues, we propose CRAG, a novel approach that effectively reduces the number of prompting tokens without degrading the quality of the generated response compared to a solution using RAG. Through our experiments, we show that CRAG can reduce the number of tokens by at least 46%, achieving more than 90% in some cases, compared to RAG. Moreover, the number of tokens with CRAG does not increase considerably as the number of reviews analyzed grows, unlike RAG, where the number of tokens is almost 9x higher with 75 reviews than with 4 reviews.
- Europe > France (0.05)
- South America > Brazil > São Paulo > Campinas (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Asia > Kazakhstan > Akmola Region > Astana (0.04)
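The token savings described above come from collapsing redundant retrieved chunks before prompting. This is a minimal sketch of that idea only, not the authors' pipeline: greedy Jaccard-similarity grouping stands in for whatever clustering the full approach uses, and one representative per group is kept in the prompt.

```python
import re

def token_set(text):
    # Crude word-level tokenizer; a real pipeline would use embeddings.
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    sa, sb = token_set(a), token_set(b)
    return len(sa & sb) / len(sa | sb)

def cluster_chunks(chunks, threshold=0.5):
    """Greedily group near-duplicate chunks; keep one representative each."""
    representatives = []
    for chunk in chunks:
        if all(jaccard(chunk, rep) < threshold for rep in representatives):
            representatives.append(chunk)  # starts a new cluster
    return representatives

reviews = [
    "battery life is great, lasts two days",
    "great battery life, easily lasts two days",
    "screen cracked after one week",
]
context = cluster_chunks(reviews)
print(len(context))  # 2 representatives instead of 3 chunks
```

The prompt then contains only the representatives, which is why the token count grows slowly as the number of reviews rises: additional near-duplicate reviews fold into existing clusters instead of adding context.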
Corrective Retrieval Augmented Generation
Yan, Shi-Qi, Gu, Jia-Chen, Zhu, Yun, Ling, Zhen-Hua
Large language models (LLMs) inevitably exhibit hallucinations, since the accuracy of generated text cannot be secured solely by the parametric knowledge they encapsulate. Although retrieval-augmented generation (RAG) is a practicable complement to LLMs, it relies heavily on the relevance of retrieved documents, raising concerns about how the model behaves if retrieval goes wrong. To this end, we propose Corrective Retrieval Augmented Generation (CRAG) to improve the robustness of generation. Specifically, a lightweight retrieval evaluator is designed to assess the overall quality of retrieved documents for a query, returning a confidence degree based on which different knowledge retrieval actions can be triggered. Since retrieval from static and limited corpora can only return sub-optimal documents, large-scale web searches are utilized as an extension for augmenting the retrieval results. In addition, a decompose-then-recompose algorithm is designed for retrieved documents to selectively focus on key information and filter out irrelevant content. CRAG is plug-and-play and can be seamlessly coupled with various RAG-based approaches. Experiments on four datasets covering short- and long-form generation tasks show that CRAG can significantly improve the performance of RAG-based approaches.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Poland > Podlaskie Province (0.14)
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.05)
- (10 more...)
- Leisure & Entertainment > Sports > Tennis (0.46)
- Government (0.46)
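The control flow in the abstract above (evaluator confidence deciding between trusting retrieval, falling back to web search, or blending both) can be sketched roughly as follows. The thresholds and the evaluator itself are placeholders, not the paper's trained components:

```python
# Hedged sketch of confidence-triggered retrieval actions: the evaluator
# scores the retrieved documents for a query, and the resulting confidence
# selects one of three actions before generation.
def decide_action(confidence, upper=0.7, lower=0.3):
    if confidence >= upper:
        return "correct"    # trust retrieval; refine the documents, then generate
    if confidence <= lower:
        return "incorrect"  # discard the documents; fall back to web search
    return "ambiguous"      # combine refined documents with web-search results

print(decide_action(0.9))  # correct
print(decide_action(0.1))  # incorrect
print(decide_action(0.5))  # ambiguous
```

The refinement step in the "correct" and "ambiguous" branches is where the decompose-then-recompose algorithm would slot in, filtering irrelevant passages before the final prompt is assembled.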
Reading The Game: Shadow Of Mordor
For years now, some of the best, wildest, most moving or revealing stories we've been telling ourselves have come not from books, movies or TV, but from video games. So we're running an occasional series, Reading The Game, in which we take a look at some of these games from a literary perspective. They march and they argue. They taunt their human slaves and, when they pass close enough, I can hear them talking about me -- Talion, called Gravewalker, murdered Captain of Gondor brought back to life by magic and the influence of my mostly-invisible elf/wraith buddy, Celebrimbor, who is a ghost that lives in my head. I am bored out of my elf-inhabited mind.