BERGEN: A Benchmarking Library for Retrieval-Augmented Generation

Rau, David, Déjean, Hervé, Chirkova, Nadezhda, Formal, Thibault, Wang, Shuai, Nikoulina, Vassilina, Clinchant, Stéphane

Jul-1-2024–arXiv.org Artificial Intelligence

Retrieval-Augmented Generation allows to enhance Large Language Models with external knowledge. In response to the recent popularity of generative LLMs, many RAG approaches have been proposed, which involve an intricate number of different configurations such as evaluation datasets, collections, metrics, retrievers, and LLMs. Inconsistent benchmarking poses a major challenge in comparing approaches and understanding the impact of each component in the pipeline. In this work, we study best practices that lay the groundwork for a systematic evaluation of RAG and present BERGEN, an end-to-end library for reproducible research standardizing RAG experiments. In an extensive study focusing on QA, we benchmark different state-of-the-art retrievers, rerankers, and LLMs. Additionally, we analyze existing RAG metrics and datasets. Our open-source library BERGEN is available under \url{https://github.com/naver/bergen}.

benchmarking library, dataset, retrieval, (15 more...)

arXiv.org Artificial Intelligence

Jul-1-2024

arXiv.org PDF

Add feedback

Country:
- North America
  - Dominican Republic (0.04)
  - United States > New York
    - New York County > New York City (0.04)
  - Canada
    - Ontario > Toronto (0.04)
    - British Columbia > Metro Vancouver Regional District
      - Vancouver (0.04)
- Europe
  - Switzerland (0.04)
  - Monaco (0.04)
  - Spain > Valencian Community
    - Valencia Province > Valencia (0.04)
  - Portugal > Lisbon
    - Lisbon (0.04)
  - Italy
    - Tuscany > Florence (0.04)
    - Calabria > Catanzaro Province
      - Catanzaro (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - Singapore (0.04)
  - Myanmar > Tanintharyi Region
    - Dawei (0.04)
  - Middle East
    - UAE (0.14)
    - Jordan (0.04)
  - Japan > Honshū
    - Chūbu > Toyama Prefecture > Toyama (0.04)

Genre:
- Research Report (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found