Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models
Li, Jiatao, Hu, Xinyu, Yin, Xunjian, Wan, Xiaojun
arXiv.org Artificial Intelligence
The integration of documents generated by LLMs themselves (Self-Docs) alongside retrieved documents has emerged as a promising strategy for retrieval-augmented generation (RAG) systems. However, previous research has focused primarily on optimizing the use of Self-Docs, leaving their inherent properties underexplored. To bridge this gap, we first investigate the overall effectiveness of Self-Docs, identifying key factors that shape their contribution to RAG performance (RQ1). Building on these insights, we develop a taxonomy grounded in Systemic Functional Linguistics to compare the influence of various Self-Docs categories (RQ2) and explore strategies for combining them with external sources (RQ3). Our findings reveal which types of Self-Docs are most beneficial and offer practical guidelines for leveraging them to achieve significant improvements in knowledge-intensive question answering tasks.
Dec-14-2024
- Country:
  - Asia > Russia (1.00)
  - Europe (1.00)
  - North America > United States
    - Kentucky (0.16)
- Genre:
  - Research Report > New Finding (0.87)
- Industry:
  - Aerospace & Defense > Aircraft (1.00)
  - Government
    - Foreign Policy (0.68)
    - Regional Government
      - Asia Government > Russia Government (1.00)
      - Europe Government > Russia Government (1.00)
  - Leisure & Entertainment (1.00)
  - Media
  - Transportation > Air (1.00)
- Technology: