Goto

Collaborating Authors

 Information Retrieval


In-Context Adaptation to Concept Drift for Learned Database Operations

arXiv.org Artificial Intelligence

Machine learning has demonstrated transformative potential for database operations, such as query optimization and in-database data analytics. However, dynamic database environments, characterized by frequent updates and evolving data distributions, introduce concept drift, which leads to performance degradation for learned models and limits their practical applicability. Addressing this challenge requires efficient frameworks capable of adapting to shifting concepts while minimizing the overhead of retraining or fine-tuning. In this paper, we propose FLAIR, an online adaptation framework that introduces a new paradigm called \textit{in-context adaptation} for learned database operations. FLAIR leverages the inherent property of data systems, i.e., immediate availability of execution results for predictions, to enable dynamic context construction. By formalizing adaptation as $f:(\mathbf{x} \,| \,C_t) \to \mathbf{y}$, with $C_t$ representing a dynamic context memory, FLAIR delivers predictions aligned with the current concept, eliminating the need for runtime parameter optimization. To achieve this, FLAIR integrates two key modules: a Task Featurization Module for encoding task-specific features into standardized representations, and a Dynamic Decision Engine, pre-trained via Bayesian meta-training, to adapt seamlessly using contextual information at runtime. Extensive experiments across key database tasks demonstrate that FLAIR outperforms state-of-the-art baselines, achieving up to 5.2x faster adaptation and reducing error by 22.5% for cardinality estimation.


MIRB: Mathematical Information Retrieval Benchmark

arXiv.org Artificial Intelligence

Mathematical Information Retrieval (MIR) is the task of retrieving information from mathematical documents and plays a key role in various applications, including theorem search in mathematical libraries, answer retrieval on math forums, and premise selection in automated theorem proving. However, a unified benchmark for evaluating these diverse retrieval tasks has been lacking. In this paper, we introduce MIRB (Mathematical Information Retrieval Benchmark) to assess the MIR capabilities of retrieval models. MIRB includes four tasks: semantic statement retrieval, question-answer retrieval, premise retrieval, and formula retrieval, spanning a total of 12 datasets. We evaluate 13 retrieval models on this benchmark and analyze the challenges inherent to MIR. We hope that MIRB provides a comprehensive framework for evaluating MIR systems and helps advance the development of more effective retrieval models tailored to the mathematical domain.


Single LLM, Multiple Roles: A Unified Retrieval-Augmented Generation Framework Using Role-Specific Token Optimization

arXiv.org Artificial Intelligence

Existing studies have optimized retrieval-augmented generation (RAG) across various sub-tasks, such as query understanding and retrieval refinement, but integrating these optimizations into a unified framework remains challenging. To tackle this problem, this work proposes RoleRAG, a unified RAG framework that achieves efficient multi-task processing through role-specific token optimization. RoleRAG comprises six modules, each handling a specific sub-task within the RAG process. Additionally, we introduce a query graph to represent the decomposition of the query, which can be dynamically resolved according to the decomposing state. All modules are driven by the same underlying LLM, distinguished by task-specific role tokens that are individually optimized. This design allows RoleRAG to dynamically activate different modules within a single LLM instance, thereby streamlining deployment and reducing resource consumption. Experimental results on five open-domain question-answering datasets demonstrate the effectiveness, generalizability, and flexibility of our framework.


An Alternative to FLOPS Regularization to Effectively Productionize SPLADE-Doc

arXiv.org Artificial Intelligence

Learned Sparse Retrieval (LSR) models encode text as weighted term vectors, which need to be sparse to leverage inverted index structures during retrieval. SPLADE, the most popular LSR model, uses FLOPS regularization to encourage vector sparsity during training. However, FLOPS regularization does not ensure sparsity among terms - only within a given query or document. Terms with very high Document Frequencies (DFs) substantially increase latency in production retrieval engines, such as Apache Solr, due to their lengthy posting lists. To address the issue of high DFs, we present a new variant of FLOPS regularization: DF-FLOPS. This new regularization technique penalizes the usage of high-DF terms, thereby shortening posting lists and reducing retrieval latency. Unlike other inference-time sparsification methods, such as stopword removal, DF-FLOPS regularization allows for the selective inclusion of high-frequency terms in cases where the terms are truly salient. We find that DF-FLOPS successfully reduces the prevalence of high-DF terms and lowers retrieval latency (around 10x faster) in a production-grade engine while maintaining effectiveness both in-domain (only a 2.2-point drop in MRR@10) and cross-domain (improved performance in 12 out of 13 tasks on which we tested). With retrieval latencies on par with BM25, this work provides an important step towards making LSR practical for deployment in production-grade search engines.


Efficient and Scalable Neural Symbolic Search for Knowledge Graph Complex Query Answering

arXiv.org Artificial Intelligence

Complex Query Answering (CQA) aims to retrieve answer sets for complex logical formulas from incomplete knowledge graphs, which is a crucial yet challenging task in knowledge graph reasoning. While neuro-symbolic search utilized neural link predictions achieve superior accuracy, they encounter significant complexity bottlenecks: (i) Data complexity typically scales quadratically with the number of entities in the knowledge graph, and (ii) Query complexity becomes NP-hard for cyclic queries. Consequently, these approaches struggle to effectively scale to larger knowledge graphs and more complex queries. To address these challenges, we propose an efficient and scalable symbolic search framework. First, we propose two constraint strategies to compute neural logical indices to reduce the domain of variables, thereby decreasing the data complexity of symbolic search. Additionally, we introduce an approximate algorithm based on local search to tackle the NP query complexity of cyclic queries. Experiments on various CQA benchmarks demonstrate that our framework reduces the computational load of symbolic methods by 90\% while maintaining nearly the same performance, thus alleviating both efficiency and scalability issues.


Enhancing Keyphrase Extraction from Academic Articles Using Section Structure Information

arXiv.org Artificial Intelligence

The exponential increase in academic papers has significantly increased the time required for researchers to access relevant literature. Keyphrase Extraction (KPE) offers a solution to this situation by enabling researchers to efficiently retrieve relevant literature. The current study on KPE from academic articles aims to improve the performance of extraction models through innovative approaches using Title and Abstract as input corpora. However, the semantic richness of keywords is significantly constrained by the length of the abstract. While full-text-based KPE can address this issue, it simultaneously introduces noise, which significantly diminishes KPE performance. To address this issue, this paper utilized the structural features and section texts obtained from the section structure information of academic articles to extract keyphrase from academic papers. The approach consists of two main parts: (1) exploring the effect of seven structural features on KPE models, and (2) integrating the extraction results from all section texts used as input corpora for KPE models via a keyphrase integration algorithm to obtain the keyphrase integration result. Furthermore, this paper also examined the effect of the classification quality of section structure on the KPE performance. The results show that incorporating structural features improves KPE performance, though different features have varying effects on model efficacy. The keyphrase integration approach yields the best performance, and the classification quality of section structure can affect KPE performance. These findings indicate that using the section structure information of academic articles contributes to effective KPE from academic articles. The code and dataset supporting this study are available at https://github.com/yan-xinyi/SSB_KPE.


Cross-Domain Diffusion with Progressive Alignment for Efficient Adaptive Retrieval

arXiv.org Artificial Intelligence

--Unsupervised efficient domain adaptive retrieval aims to transfer knowledge from a labeled source domain to an unlabeled target domain, while maintaining low storage cost and high retrieval efficiency. However, existing methods typically fail to address potential noise in the target domain, and directly align high-level features across domains, thus resulting in suboptimal retrieval performance. T o address these challenges, we propose a novel Cross-Domain Diffusion with Progressive Alignment method (COUPLE). This approach revisits unsupervised efficient domain adaptive retrieval from a graph diffusion perspective, simulating cross-domain adaptation dynamics to achieve a stable target domain adaptation process. First, we construct a cross-domain relationship graph and leverage noise-robust graph flow diffusion to simulate the transfer dynamics from the source domain to the target domain, identifying lower noise clusters. Furthermore, we employ a hierarchical Mixup operation for progressive domain alignment, which is performed along the cross-domain random walk paths. Utilizing target domain discriminative hash learning and progressive domain alignment, COUPLE enables effective domain adaptive hash learning. Extensive experiments demonstrate COUPLE's effectiveness on competitive benchmarks. PPROXIMA TE nearest neighbor (ANN) [1], [2] search, which aims to efficiently find data samples in a dataset that are close to a given query sample within an acceptable margin of error, has garnered significant attention due to its wide range of applications, e.g., image retrieval [3], [4], search engines [5], recommender system [6] and retrieval-augmented generation (RAG) [7], [8]. Hash-based ANN retrieval [9]-[13] offers higher efficiency and lower storage costs by replacing computationally expensive pairwise distance calculations with bit-wise XOR and bit-counting operations [14]. Junyu Luo, Y usheng Zhao, Wei Ju, Ming Zhang are with State Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University, Beijing, China. Xiao Luo is with Department of Computer Science, University of California, Los Angeles, USA. Zhiping Xiao is with Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, USA. Li Shen is with JD Explore Academy.


JIR-Arena: The First Benchmark Dataset for Just-in-time Information Recommendation

arXiv.org Artificial Intelligence

Just-in-time Information Recommendation (JIR) is a service designed to deliver the most relevant information precisely when users need it, , addressing their knowledge gaps with minimal effort and boosting decision-making and efficiency in daily life. Advances in device-efficient deployment of foundation models and the growing use of intelligent wearable devices have made always-on JIR assistants feasible. However, there has been no systematic effort to formally define JIR tasks or establish evaluation frameworks. To bridge this gap, we present the first mathematical definition of JIR tasks and associated evaluation metrics. Additionally, we introduce JIR-Arena, a multimodal benchmark dataset featuring diverse, information-request-intensive scenarios to evaluate JIR systems across critical dimensions: i) accurately inferring user information needs, ii) delivering timely and relevant recommendations, and iii) avoiding irrelevant content that may distract users. Developing a JIR benchmark dataset poses challenges due to subjectivity in estimating user information needs and uncontrollable system variables affecting reproducibility. To address these, JIR-Arena: i) combines input from multiple humans and large AI models to approximate information need distributions; ii) assesses JIR quality through information retrieval outcomes using static knowledge base snapshots; and iii) employs a multi-turn, multi-entity validation framework to improve objectivity and generality. Furthermore, we implement a baseline JIR system capable of processing real-time information streams aligned with user inputs. Our evaluation of this baseline system on JIR-Arena indicates that while foundation model-based JIR systems simulate user needs with reasonable precision, they face challenges in recall and effective content retrieval. To support future research in this new area, we fully release our code and data.


Google unveils 'AI Mode' in the next phase of its journey to change search

The Guardian

Google on Tuesday unleashed another wave of artificial intelligence technology to accelerate a year-long makeover of its search engine that is changing the way people get information and curtailing the flow of internet traffic to other websites. The next phase outlined at Google's annual developers conference includes releasing a new "AI mode" option in the United States. The company says the feature will make interacting with its search engine more like having a conversation with an expert capable of answering a wide array of questions. AI mode is being offered to all users in the US just two-and-a-half months after the company began testing with a limited Labs division audience. Google is also feeding its latest AI model, Gemini 2.5, into its search algorithms and will soon begin testing other AI features, such as the ability to automatically buy concert tickets and conduct searches through live video feeds.


Second SIGIR Workshop on Simulations for Information Access (Sim4IA 2025)

arXiv.org Artificial Intelligence

Simulations in information access (IA) have recently gained interest, as shown by various tutorials and workshops around that topic. Simulations can be key contributors to central IA research and evaluation questions, especially around interactive settings when real users are unavailable, or their participation is impossible due to ethical reasons. In addition, simulations in IA can help contribute to a better understanding of users, reduce complexity of evaluation experiments, and improve reproducibility. Building on recent developments in methods and toolkits, the second iteration of our Sim4IA workshop aims to again bring together researchers and practitioners to form an interactive and engaging forum for discussions on the future perspectives of the field. An additional aim is to plan an upcoming TREC/CLEF campaign.