Fine-tune the Entire RAG Architecture (including DPR retriever) for Question-Answering

Siriwardhana, Shamane, Weerasekera, Rivindu, Wen, Elliott, Nanayakkara, Suranga

arXiv.org Artificial Intelligence 

In September 2020, Facebook open-sourced a new NLP model called Retrieval Augmented Generation (RAG) on the Hugging Face Transformer library. RAG is capable to use a set of support documents from an external knowledge base as a latent variable to generate the final output. The RAG model consists of an Input Encoder, a Neural Retriever, and an Output Generator. All three components are initialized with pre-trained transformers. However, the original Hugging Face implementation only allowed fine-tuning the Input Encoder and the Output Generator in an end-toend manner, while the Neural Retriever needs to be trained seperately. To the best of our knowledge, an end-to-end RAG implementation that trains all three components does not exist.