hipporag
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Singapore (0.04)
- Asia > Indonesia > Bali (0.04)
- (16 more...)
- Research Report > Experimental Study (1.00)
- Overview (0.88)
- Research Report > New Finding (0.67)
- Media > Film (0.68)
- Leisure & Entertainment (0.68)
- Government (0.67)
- Health & Medicine > Therapeutic Area > Oncology (0.46)
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
In order to thrive in hostile and ever-changing natural environments, mammalian brains evolved to store large amounts of knowledge about the world and continually integrate new information while avoiding catastrophic forgetting. Despite the impressive accomplishments, large language models (LLMs), even with retrieval-augmented generation (RAG), still struggle to efficiently and effectively integrate a large amount of new experiences after pre-training. In this work, we introduce HippoRAG, a novel retrieval framework inspired by the hippocampal indexing theory of human long-term memory to enable deeper and more efficient knowledge integration over new experiences. HippoRAG synergistically orchestrates LLMs, knowledge graphs, and the Personalized PageRank algorithm to mimic the different roles of neocortex and hippocampus in human memory. We compare HippoRAG with existing RAG methods on multi-hop question answering (QA) and show that our method outperforms the state-of-the-art methods remarkably, by up to 20%. Single-step retrieval with HippoRAG achieves comparable or better performance than iterative retrieval like IRCoT while being 10-30 times cheaper and 6-13 times faster, and integrating HippoRAG into IRCoT brings further substantial gains. Finally, we show that our method can tackle new types of scenarios that are out of reach of existing methods.
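To make the retrieval step above concrete, here is a minimal sketch (not the authors' implementation) of Personalized PageRank over an LLM-extracted knowledge graph: the graph is built offline from (subject, relation, object) triples, PPR is seeded with entities recognized in the question, and passages are ranked by the PPR mass of the entities they mention. The triples, the entity linking, and the passage-scoring rule below are illustrative assumptions.

```python
# Minimal sketch of HippoRAG-style retrieval, assuming triples were already
# extracted per passage during indexing (normally done by an LLM).
import networkx as nx

passage_triples = {
    "doc1": [("Stanford", "located_in", "California")],
    "doc2": [("Alhambra", "located_in", "California")],
}

graph = nx.Graph()
entity_to_passages = {}
for doc_id, triples in passage_triples.items():
    for subj, _rel, obj in triples:
        graph.add_edge(subj, obj)
        entity_to_passages.setdefault(subj, set()).add(doc_id)
        entity_to_passages.setdefault(obj, set()).add(doc_id)

# Online retrieval: seed Personalized PageRank with entities found in the query.
query_entities = ["Stanford"]  # would come from LLM/NER over the question
personalization = {n: (1.0 if n in query_entities else 0.0) for n in graph}
ppr_scores = nx.pagerank(graph, alpha=0.85, personalization=personalization)

# Illustrative scoring rule: a passage's score is the PPR mass of its entities.
passage_scores = {
    doc_id: sum(score for ent, score in ppr_scores.items()
                if doc_id in entity_to_passages.get(ent, set()))
    for doc_id in passage_triples
}
print(sorted(passage_scores.items(), key=lambda kv: -kv[1]))
```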
Graph-Guided Concept Selection for Efficient Retrieval-Augmented Generation
Liu, Ziyu, Liu, Yijing, Yuan, Jianfei, Yan, Minzhi, Yue, Le, Xiong, Honghui, Yang, Yi
Graph-based RAG constructs a knowledge graph (KG) from text chunks to enhance retrieval in Large Language Model (LLM)-based question answering. It is especially beneficial in domains such as biomedicine, law, and political science, where effective retrieval often involves multi-hop reasoning over proprietary documents. However, these methods demand numerous LLM calls to extract entities and relations from text chunks, incurring prohibitive costs at scale. Through a carefully designed ablation study, we observe that certain words (termed concepts) and their associated documents are more important than others. Based on this insight, we propose Graph-Guided Concept Selection (G2ConS). Its core comprises a chunk selection method and an LLM-independent concept graph. The former selects salient document chunks to reduce KG construction costs; the latter closes knowledge gaps introduced by chunk selection at zero cost. Evaluations on multiple real-world datasets show that G2ConS outperforms all baselines in construction cost, retrieval effectiveness, and answering quality.
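A hedged sketch of the general idea, not G2ConS itself: rank chunks by the salience of the concepts they contain, send only the top-ranked chunks to the LLM for triple extraction, and build a zero-LLM-cost concept co-occurrence graph over all chunks so that information in unselected chunks stays reachable. The concept detector, the salience measure, and the data are placeholders.

```python
# Hypothetical illustration of chunk selection plus an LLM-free concept graph.
from collections import Counter
from itertools import combinations

chunks = {
    "c1": "insulin regulates blood glucose in patients with diabetes",
    "c2": "the committee reviewed the annual budget report",
    "c3": "metformin lowers blood glucose by reducing hepatic glucose production",
}

def concepts(text):
    # Stand-in for real concept detection: keep words longer than 4 characters.
    return {w for w in text.lower().split() if len(w) > 4}

# Concept salience approximated by document frequency across chunks.
df = Counter(c for text in chunks.values() for c in concepts(text))

# Chunk score = total salience of its concepts; only top-k go to the LLM.
k = 2
chunk_scores = {cid: sum(df[c] for c in concepts(text)) for cid, text in chunks.items()}
selected = sorted(chunk_scores, key=chunk_scores.get, reverse=True)[:k]
print("chunks sent to the LLM:", selected)

# Zero-cost concept graph: connect concepts that co-occur in any chunk and
# remember which chunks support each edge, covering unselected chunks too.
concept_graph = {}
for cid, text in chunks.items():
    for a, b in combinations(sorted(concepts(text)), 2):
        concept_graph.setdefault((a, b), set()).add(cid)
print("sample edges:", list(concept_graph)[:3])
```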
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- North America > United States > New Jersey (0.04)
- Asia > China (0.04)
GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation
Luo, Linhao, Zhao, Zicheng, Haffari, Gholamreza, Phung, Dinh, Gong, Chen, Pan, Shirui
Retrieval-augmented generation (RAG) has proven effective in integrating knowledge into large language models (LLMs). However, conventional RAGs struggle to capture complex relationships between pieces of knowledge, limiting their performance in intricate reasoning that requires integrating knowledge from multiple sources. Recently, graph-enhanced retrieval augmented generation (GraphRAG) builds graph structure to explicitly model these relationships, enabling more effective and efficient retrievers. Nevertheless, its performance is still hindered by the noise and incompleteness within the graph structure. To address this, we introduce GFM-RAG, a novel graph foundation model (GFM) for retrieval augmented generation. GFM-RAG is powered by an innovative graph neural network that reasons over graph structure to capture complex query-knowledge relationships. The GFM with 8M parameters undergoes a two-stage training process on large-scale datasets, comprising 60 knowledge graphs with over 14M triples and 700k documents. This results in impressive performance and generalizability for GFM-RAG, making it the first graph foundation model applicable to unseen datasets for retrieval without any fine-tuning required. Extensive experiments on three multi-hop QA datasets and seven domain-specific RAG datasets demonstrate that GFM-RAG achieves state-of-the-art performance while maintaining efficiency and alignment with neural scaling laws, highlighting its potential for further improvement.
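As a toy illustration of the kind of computation described, the snippet below replaces the trained 8M-parameter GFM with two hand-written message-passing steps over a tiny knowledge graph: entity relevance is initialized from query similarity and then propagated to neighbors, so multi-hop evidence rises in the ranking. The embeddings, graph, and update rule are all invented for illustration; the real model learns these updates.

```python
# Toy, untrained stand-in for GNN-based reasoning over a KG.
import numpy as np

rng = np.random.default_rng(0)
entities = ["insulin", "glucose", "diabetes", "budget"]
emb = {e: rng.normal(size=16) for e in entities}        # pretend entity embeddings
query_emb = emb["insulin"] + 0.1 * rng.normal(size=16)  # pretend query embedding

# Adjacency matrix of a tiny, symmetric KG.
idx = {e: i for i, e in enumerate(entities)}
A = np.zeros((len(entities), len(entities)))
for a, b in [("insulin", "glucose"), ("glucose", "diabetes")]:
    A[idx[a], idx[b]] = A[idx[b], idx[a]] = 1.0

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

# Layer 0: relevance of each entity to the query.
h = np.array([cosine(query_emb, emb[e]) for e in entities])

# Two message-passing steps: neighbors of relevant entities become relevant too.
for _ in range(2):
    h = 0.5 * h + 0.5 * (A @ h) / np.maximum(A.sum(axis=1), 1.0)

print(sorted(zip(entities, h), key=lambda kv: -kv[1]))
```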
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Pacific Ocean (0.04)
- (7 more...)
GeAR: Graph-enhanced Agent for Retrieval-augmented Generation
Shen, Zhili, Diao, Chenxin, Vougiouklis, Pavlos, Merita, Pascual, Piramanayagam, Shriram, Graux, Damien, Tu, Dandan, Jiang, Zeren, Lai, Ruofei, Ren, Yang, Pan, Jeff Z.
Retrieval-augmented generation systems rely on effective document retrieval capabilities. By design, conventional sparse or dense retrievers face challenges in multi-hop retrieval scenarios. In this paper, we present GeAR, which advances RAG performance through two key innovations: (i) graph expansion, which enhances any conventional base retriever, such as BM25, and (ii) an agent framework that incorporates graph expansion. Our evaluation demonstrates GeAR's superior retrieval performance on three multi-hop question answering datasets. Additionally, our system achieves state-of-the-art results with improvements exceeding 10% on the challenging MuSiQue dataset, while requiring fewer tokens and iterations compared to other multi-step retrieval systems.
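A rough sketch of graph expansion on top of a base retriever, with placeholder pieces: a naive term-overlap scorer stands in for BM25, and a shared-entity document graph stands in for whatever linking structure GeAR actually uses. The point is that neighbors of the top base-retrieved documents are pulled in, surfacing multi-hop evidence the base retriever alone would miss.

```python
# Hypothetical graph expansion around a simple base retriever.
docs = {
    "d1": "Marie Curie was born in Warsaw",
    "d2": "Marie Curie won the Nobel Prize in Physics",
    "d3": "Warsaw is the capital of Poland",
}
doc_graph = {"d1": {"d2", "d3"}, "d2": {"d1"}, "d3": {"d1"}}  # e.g. shared-entity links

def base_score(query, text):
    # Stand-in for BM25: count of query terms appearing in the document.
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t)

def retrieve(query, top_k=1, expand=True):
    scored = sorted(docs, key=lambda d: base_score(query, docs[d]), reverse=True)
    results = set(scored[:top_k])
    if expand:
        # Graph expansion: add graph neighbors of the top base results.
        for d in list(results):
            results |= doc_graph.get(d, set())
    return sorted(results)

# A multi-hop question whose answer needs evidence beyond the single top hit.
print(retrieve("In which country was Nobel laureate Marie Curie born?", top_k=1))
```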
- North America > United States > Michigan (0.04)
- Asia > Indonesia > Java > Jakarta > Jakarta (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (13 more...)
- Media > Film (1.00)
- Leisure & Entertainment > Sports > Basketball (0.93)
- Education (0.93)
SiReRAG: Indexing Similar and Related Information for Multihop Reasoning
Zhang, Nan, Choubey, Prafulla Kumar, Fabbri, Alexander, Bernadett-Shapiro, Gabriel, Zhang, Rui, Mitra, Prasenjit, Xiong, Caiming, Wu, Chien-Sheng
Indexing is an important step towards strong performance in retrieval-augmented generation (RAG) systems. However, existing methods organize data based on either semantic similarity (similarity) or related information (relatedness), but do not cover both perspectives comprehensively. Our analysis reveals that modeling only one perspective results in insufficient knowledge synthesis, leading to suboptimal performance on complex tasks requiring multihop reasoning. In this paper, we propose SiReRAG, a novel RAG indexing approach that explicitly considers both similar and related information. On the similarity side, we follow existing work and explore some variances to construct a similarity tree based on recursive summarization. On the relatedness side, SiReRAG extracts propositions and entities from texts, groups propositions via shared entities, and generates recursive summaries to construct a relatedness tree. We index and flatten both similarity and relatedness trees into a unified retrieval pool. Our experiments demonstrate that SiReRAG consistently outperforms state-of-the-art indexing methods on three multihop datasets (MuSiQue, 2WikiMultiHopQA, and HotpotQA), with an average 1.9% improvement in F1 scores. As a reasonably efficient solution, SiReRAG enhances existing reranking methods significantly, with up to 7.8% improvement in average F1 scores.
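A simplified sketch of the two-sided index described above, under stated assumptions: propositions and their entities are given rather than LLM-extracted, recursive summaries are stubbed as concatenations, and the similarity tree is reduced to a single pre-made cluster summary. It only shows the shape of the index, with similarity and relatedness nodes flattened into one retrieval pool.

```python
# Illustrative stand-in for similarity + relatedness indexing.
propositions = [
    ("Marie Curie won two Nobel Prizes.", {"Marie Curie", "Nobel Prize"}),
    ("The Nobel Prize is awarded in Stockholm.", {"Nobel Prize", "Stockholm"}),
    ("Pierre Curie collaborated with Marie Curie.", {"Pierre Curie", "Marie Curie"}),
]

# Relatedness side: group propositions that share an entity, then "summarize"
# each group (here a plain concatenation stands in for a recursive summary).
groups = {}
for text, ents in propositions:
    for e in ents:
        groups.setdefault(e, []).append(text)
relatedness_nodes = [f"[about {e}] " + " ".join(texts) for e, texts in groups.items()]

# Similarity side (stub): a recursive-summary tree built from embedding clusters
# would normally go here; one cluster summary stands in for that tree.
similarity_nodes = ["[cluster] Facts about the Curies and the Nobel Prize."]

# Flatten both sides into a single retrieval pool alongside the raw propositions.
retrieval_pool = [t for t, _ in propositions] + similarity_nodes + relatedness_nodes
for node in retrieval_pool:
    print(node)
```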
- North America > United States > California (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Washington > Walla Walla County > Walla Walla (0.04)
- (13 more...)
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
Gutiérrez, Bernal Jiménez, Shu, Yiheng, Gu, Yu, Yasunaga, Michihiro, Su, Yu
In order to thrive in hostile and ever-changing natural environments, mammalian brains evolved to store large amounts of knowledge about the world and continually integrate new information while avoiding catastrophic forgetting. Despite the impressive accomplishments, large language models (LLMs), even with retrieval-augmented generation (RAG), still struggle to efficiently and effectively integrate a large amount of new experiences after pre-training. In this work, we introduce HippoRAG, a novel retrieval framework inspired by the hippocampal indexing theory of human long-term memory to enable deeper and more efficient knowledge integration over new experiences. HippoRAG synergistically orchestrates LLMs, knowledge graphs, and the Personalized PageRank algorithm to mimic the different roles of neocortex and hippocampus in human memory. We compare HippoRAG with existing RAG methods on multi-hop question answering and show that our method outperforms the state-of-the-art methods remarkably, by up to 20%. Single-step retrieval with HippoRAG achieves comparable or better performance than iterative retrieval like IRCoT while being 10-30 times cheaper and 6-13 times faster, and integrating HippoRAG into IRCoT brings further substantial gains. Finally, we show that our method can tackle new types of scenarios that are out of reach of existing methods. Code and data are available at https://github.com/OSU-NLP-Group/HippoRAG.
- Europe > France (0.14)
- Asia > Timor-Leste (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- (23 more...)
- Leisure & Entertainment (1.00)
- Government (0.93)
- Media > Film (0.68)
- Health & Medicine > Therapeutic Area > Neurology (0.49)