Fine-tune the Entire RAG Architecture (including DPR retriever) for Question-Answering

Siriwardhana, Shamane, Weerasekera, Rivindu, Wen, Elliott, Nanayakkara, Suranga

Jun-21-2021–arXiv.org Artificial Intelligence

In September 2020, Facebook open-sourced a new NLP model called Retrieval Augmented Generation (RAG) in the Hugging Face Transformers library. RAG can use a set of support documents from an external knowledge base as a latent variable to generate the final output. The RAG model consists of an Input Encoder, a Neural Retriever, and an Output Generator. All three components are initialized with pre-trained transformers. However, the original Hugging Face implementation only allowed fine-tuning the Input Encoder and the Output Generator in an end-to-end manner, while the Neural Retriever needed to be trained separately. To the best of our knowledge, an end-to-end RAG implementation that trains all three components does not exist.
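To make the "support documents as a latent variable" idea concrete, here is a minimal toy sketch of how a RAG-style model could weight a generator's answer probability by a retrieval distribution over the top-k documents. It is an illustration under stated assumptions, not the paper's or Hugging Face's implementation: the function names (`softmax`, `rag_marginal`), the 2-dimensional vectors, and the per-document generator probabilities are all made up for the example, and relevance is assumed to be an inner product between a question embedding and document embeddings.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()

def rag_marginal(question_vec, doc_vecs, gen_probs, k=2):
    """Toy marginal: sum over the top-k documents z of p(z | x) * p(y | x, z).

    question_vec: embedding of the question (stand-in for the Input Encoder output)
    doc_vecs:     one embedding per document (stand-in for the retriever's index)
    gen_probs:    hypothetical p(y | x, z) for each document (stand-in for the generator)
    """
    scores = doc_vecs @ question_vec          # assumed inner-product relevance
    top_k = np.argsort(scores)[::-1][:k]      # indices of the k highest-scoring documents
    p_doc = softmax(scores[top_k])            # retrieval distribution restricted to top-k
    return float(p_doc @ gen_probs[top_k]), top_k

# Hypothetical toy data: three documents with made-up generator probabilities.
question_vec = np.array([1.0, 0.0])
doc_vecs = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
gen_probs = np.array([0.8, 0.1, 0.4])

p_answer, used_docs = rag_marginal(question_vec, doc_vecs, gen_probs, k=2)
print(used_docs, p_answer)
```

In this toy setup the answer probability depends on the retrieval scores as well as on the generator, which is why a training signal could in principle reach the retriever too. How the paper actually trains the retriever is not described in this abstract, so it is not sketched here.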

machine learning, natural language, question answering
