MiniRAG: Towards Extremely Simple Retrieval-Augmented Generation
Tianyu Fan, Jingyuan Wang, Xubin Ren, Chao Huang
arXiv.org Artificial Intelligence
In on-device Retrieval-Augmented Generation (RAG) systems, limited device compute and data-privacy constraints restrict the use of powerful models, such as large language models (LLMs) and advanced text embedding models, and force reliance on smaller alternatives. Consequently, current pipelines, which rely heavily on LLMs for a comprehensive understanding of text semantics when computing embedding similarity for retrieval, face significant challenges. Smaller models often struggle to capture the precise semantic nuances within lengthy texts, which complicates accurate matching. To tackle these challenges, it is essential to: (i) reduce the complexity of the input content for generation, so that the semantic information is clear and concise; and (ii) shorten the input content for smaller language models, which improves comprehension and retrieval accuracy. Additionally, effective graph indexing structures can help mitigate performance deficiencies in semantic matching and thereby improve the overall retrieval process. In MiniRAG, we propose a Graph-based Knowledge Retrieval mechanism that leverages the semantic-aware heterogeneous graph G constructed during the indexing phase, in conjunction with lightweight text embeddings, to achieve efficient knowledge retrieval. By employing a graph-based search design, we aim to reduce the dependence on precise semantic matching with large language models. This approach yields rich and accurate textual content at a low computational cost, which in turn improves the ability of language models to generate precise responses.
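The retrieval idea described above can be illustrated with a minimal sketch. This is not MiniRAG's implementation: the class `HeteroGraph`, the functions `embed` and `cosine`, the bag-of-words "embedding", and the `threshold` and `top_k` parameters are all hypothetical stand-ins chosen for illustration. The sketch assumes a graph with two node types (short entity names and longer text chunks), matches the query only against the short entity names, and then follows graph edges to the chunks.

```python
import math
from collections import Counter, defaultdict


def embed(text):
    """Toy stand-in for a lightweight text embedding: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class HeteroGraph:
    """Hypothetical heterogeneous graph: entity nodes linked to chunk nodes."""

    def __init__(self):
        self.chunks = {}                          # chunk id -> chunk text
        self.entity_to_chunks = defaultdict(set)  # entity name -> chunk ids

    def add_chunk(self, chunk_id, text, entities):
        # Indexing phase: store the chunk and link it to its entities.
        self.chunks[chunk_id] = text
        for entity in entities:
            self.entity_to_chunks[entity].add(chunk_id)

    def retrieve(self, query, top_k=2, threshold=0.3):
        # Compare the query with short entity names only, never with long
        # chunk texts, then follow edges from matched entities to chunks.
        q = embed(query)
        scores = defaultdict(float)
        for entity, chunk_ids in self.entity_to_chunks.items():
            sim = cosine(q, embed(entity))
            if sim >= threshold:
                for cid in chunk_ids:
                    scores[cid] += sim
        # Highest score first; ties broken by chunk id for determinism.
        ranked = sorted(scores, key=lambda cid: (-scores[cid], cid))
        return [(cid, self.chunks[cid]) for cid in ranked[:top_k]]


graph = HeteroGraph()
graph.add_chunk("c1", "Alice booked the meeting room for Monday.",
                ["alice", "meeting room"])
graph.add_chunk("c2", "Bob prefers pizza for the team lunch.",
                ["bob", "pizza"])
graph.add_chunk("c3", "The meeting room projector is broken.",
                ["meeting room", "projector"])

for cid, text in graph.retrieve("who booked the meeting room"):
    print(cid, text)
```

In this sketch the similarity computation only ever sees a few words per entity, which is one way to read the goal of shortening and simplifying what a small model must match; the graph edges then recover the full chunk text for generation.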
Jan-14-2025
- Genre:
- Research Report (0.83)
- Industry:
- Information Technology > Security & Privacy (0.54)
- Leisure & Entertainment (1.00)
- Media > Television (0.46)
- Technology: