Exploring the Practicality of Generative Retrieval on Dynamic Corpora
Yoon, Soyoung, Kim, Chaeeun, Lee, Hyunji, Jang, Joel, Yang, Sohee, Seo, Minjoon
arXiv.org Artificial Intelligence
Benchmarking of information retrieval (IR) methods is mostly conducted with a fixed set of documents (static corpora); in realistic scenarios, this is rarely the case, and the documents to be retrieved are constantly updated and added. In this paper, we conduct a comprehensive comparison between two categories of contemporary retrieval systems, Dual Encoders (DE) and Generative Retrieval (GR), in a dynamic scenario where the corpus to be retrieved is updated. We also conduct an extensive evaluation of computational and memory efficiency, which are crucial factors for deploying IR systems in the real world. Our results demonstrate that GR is more adaptable to evolving knowledge (+13-18% on the StreamingQA benchmark), more robust in handling data with temporal information (10 times), and more efficient in terms of memory (4 times), indexing time (6 times), and inference FLOPs (10 times). Our paper highlights GR's potential for future use in practical IR systems.
Nov-16-2023
- Genre:
- Research Report > New Finding (1.00)