Query Processing
AstuteRAG-FQA: Task-Aware Retrieval-Augmented Generation Framework for Proprietary Data Challenges in Financial Question Answering
Alam, Mohammad Zahangir, Zaman, Khandoker Ashik Uz, Miraz, Mahdi H.
Retrieval-Augmented Generation (RAG) shows significant promise in knowledge-intensive tasks by improving domain specificity, enhancing temporal relevance, and reducing hallucinations. However, applying RAG to finance encounters critical challenges: restricted access to proprietary datasets, limited retrieval accuracy, regulatory constraints, and sensitive data interpretation. We introduce AstuteRAG-FQA, an adaptive RAG framework tailored for Financial Question Answering (FQA), leveraging task-aware prompt engineering to address these challenges. The framework uses a hybrid retrieval strategy integrating both open-source and proprietary financial data while maintaining strict security protocols and regulatory compliance. A dynamic prompt framework adapts in real time to query complexity, improving precision and contextual relevance. To systematically address diverse financial queries, we propose a four-tier task classification: explicit factual, implicit factual, interpretable rationale, and hidden rationale involving implicit causal reasoning. For each category, we identify key challenges, datasets, and optimization techniques within the retrieval and generation process. The framework incorporates multi-layered security mechanisms including differential privacy, data anonymization, and role-based access controls to protect sensitive financial information. Additionally, AstuteRAG-FQA implements real-time compliance monitoring through automated regulatory validation systems that verify responses against industry standards and legal obligations. We evaluate three data integration techniques - contextual embedding, small model augmentation, and targeted fine-tuning - analyzing their efficiency and feasibility across varied financial environments.
ARIMA_PLUS: Large-scale, Accurate, Automatic and Interpretable In-Database Time Series Forecasting and Anomaly Detection in Google BigQuery
Cheng, Xi, Shen, Weijie, Chen, Haoming, Shen, Chaoyi, Ortega, Jean, Liu, Jiashang, Thomas, Steve, Zheng, Honglin, Wu, Haoyun, Li, Yuxiang, Lichtendahl, Casey, Ortiz, Jenny, Liu, Gang, Qi, Haiyang, Fatemieh, Omid, Fry, Chris, Long, Jing Jing
Time series forecasting and anomaly detection are common tasks for practitioners in industries such as retail, manufacturing, advertising and energy. Two unique challenges stand out: (1) efficiently and accurately forecasting time series or detecting anomalies in large volumes automatically; and (2) ensuring interpretability of results to effectively incorporate business insights. We present ARIMA_PLUS, a novel framework to overcome these two challenges by a unique combination of (a) accurate and interpretable time series models and (b) scalable and fully managed system infrastructure. The model has a sequential and modular structure to handle different components of the time series, including holiday effects, seasonality, trend, and anomalies, which enables high interpretability of the results. Novel enhancements are made to each module, and a unified framework is established to address both forecasting and anomaly detection tasks simultaneously. In terms of accuracy, its comprehensive benchmark on the 42 public datasets in the Monash forecasting repository shows superior performance over not only well-established statistical alternatives (such as ETS, ARIMA, TBATS, Prophet) but also newer neural network models (such as DeepAR, N-BEATS, PatchTST, TimeMixer). In terms of infrastructure, it is directly built into the query engine of BigQuery in Google Cloud. It uses a simple SQL interface and automates tedious technicalities such as data cleaning and model selection. It automatically scales with managed cloud computational and storage resources, making it possible to forecast 100 million time series using only 1.5 hours with a throughput of more than 18000 time series per second. In terms of interpretability, we present several case studies to demonstrate time series insights it generates and customizability it offers.
Query Complexity of Classical and Quantum Channel Discrimination
Nuradha, Theshani, Wilde, Mark M.
Quantum channel discrimination has been studied from an information-theoretic perspective, wherein one is interested in the optimal decay rate of error probabilities as a function of the number of unknown channel accesses. In this paper, we study the query complexity of quantum channel discrimination, wherein the goal is to determine the minimum number of channel uses needed to reach a desired error probability. To this end, we show that the query complexity of binary channel discrimination depends logarithmically on the inverse error probability and inversely on the negative logarithm of the (geometric and Holevo) channel fidelity. As a special case of these findings, we precisely characterize the query complexity of discriminating two classical channels and two classical-quantum channels. Furthermore, by obtaining an optimal characterization of the sample complexity of quantum hypothesis testing, including prior probabilities, we provide a more precise characterization of query complexity when the error probability does not exceed a fixed threshold. We also provide lower and upper bounds on the query complexity of binary asymmetric channel discrimination and multiple quantum channel discrimination. For the former, the query complexity depends on the geometric Rรฉnyi and Petz Rรฉnyi channel divergences, while for the latter, it depends on the negative logarithm of the (geometric and Uhlmann) channel fidelity. For multiple channel discrimination, the upper bound scales as the logarithm of the number of channels.
A Survey of Data Agents: Emerging Paradigm or Overstated Hype?
Zhu, Yizhang, Wang, Liangwei, Yang, Chenyu, Lin, Xiaotian, Li, Boyan, Zhou, Wei, Liu, Xinyu, Peng, Zhangyang, Luo, Tianqi, Li, Yu, Chai, Chengliang, Chen, Chong, Di, Shimin, Fan, Ju, Sun, Ji, Tang, Nan, Tsung, Fugee, Wang, Jiannan, Wu, Chenglin, Xu, Yanwei, Zhang, Shaolei, Zhang, Yong, Zhou, Xuanhe, Li, Guoliang, Luo, Yuyu
The rapid advancement of large language models (LLMs) has spurred the emergence of data agents--autonomous systems designed to orchestrate Data + AI ecosystems for tackling complex data-related tasks. However, the term "data agent" currently suffers from terminological ambiguity and inconsistent adoption, conflating simple query responders with sophisticated autonomous architectures. This terminological ambiguity fosters mismatched user expectations, accountability challenges, and barriers to industry growth. Inspired by the SAE J3016 standard for driving automation, this survey introduces the first systematic hierarchical taxonomy for data agents, comprising six levels that delineate and trace progressive shifts in autonomy, from manual operations (L0) to a vision of generative, fully autonomous data agents (L5), thereby clarifying capability boundaries and responsibility allocation. Through this lens, we offer a structured review of existing research arranged by increasing autonomy, encompassing specialized data agents for data management, preparation, and analysis, alongside emerging efforts toward versatile, comprehensive systems with enhanced autonomy. We further analyze critical evolutionary leaps and technical gaps for advancing data agents, especially the ongoing L2-to-L3 transition, where data agents evolve from procedural execution to autonomous orchestration. Finally, we conclude with a forward-looking roadmap, envisioning the advent of proactive, generative data agents.
AI Guided Accelerator For Search Experience
Yetukuri, Jayanth, Elyasi, Mehran, Agrawal, Samarth, Mandal, Aritra, Kong, Rui, Vempati, Harish, Khan, Ishita
Effective query reformulation is pivotal in narrowing the gap between a user's exploratory search behavior and the identification of relevant products in e-commerce environments. While traditional approaches predominantly model query rewrites as isolated pairs, they often fail to capture the sequential and transitional dynamics inherent in real-world user behavior. In this work, we propose a novel framework that explicitly models transitional queries--intermediate reformulations occurring during the user's journey toward their final purchase intent. By mining structured query trajectories from eBay's large-scale user interaction logs, we reconstruct query sequences that reflect shifts in intent while preserving semantic coherence. This approach allows us to model a user's shopping funnel, where mid-journey transitions reflect exploratory behavior and intent refinement. Furthermore, we incorporate generative Large Language Models (LLMs) to produce semantically diverse and intent-preserving alternative queries, extending beyond what can be derived through collaborative filtering alone. These reformulations can be leveraged to populate Related Searches or to power intent-clustered carousels on the search results page, enhancing both discovery and engagement. Our contributions include (i) the formal identification and modeling of transitional queries, (ii) the introduction of a structured query sequence mining pipeline for intent flow understanding, and (iii) the application of LLMs for scalable, intent-aware query expansion. Empirical evaluation demonstrates measurable gains in conversion and engagement metrics compared to the existing Related Searches module, validating the effectiveness of our approach in real-world e-commerce settings.
FACTS: Table Summarization via Offline Template Generation with Agentic Workflows
Yuan, Ye, Shabani, Mohammad Amin, Liu, Siqi
Query-focused table summarization requires generating natural language summaries of tabular data conditioned on a user query, enabling users to access insights beyond fact retrieval. Existing approaches face key limitations: table-to-text models require costly fine-tuning and struggle with complex reasoning, prompt-based LLM methods suffer from token-limit and efficiency issues while exposing sensitive data, and prior agentic pipelines often rely on decomposition, planning, or manual templates that lack robustness and scalability. To mitigate these issues, we introduce an agentic workflow, FACTS, a Fast, Accurate, and Privacy-Compliant Table Summarization approach via Offline Template Generation. FACTS produces offline templates, consisting of SQL queries and Jinja2 templates, which can be rendered into natural language summaries and are reusable across multiple tables sharing the same schema. It enables fast summarization through reusable offline templates, accurate outputs with executable SQL queries, and privacy compliance by sending only table schemas to LLMs. Evaluations on widely-used benchmarks show that FACTS consistently outperforms baseline methods, establishing it as a practical solution for real-world query-focused table summarization.
AdaptJobRec: Enhancing Conversational Career Recommendation through an LLM-Powered Agentic System
Wang, Qixin, Wang, Dawei, Chen, Kun, Hu, Yaowei, Girdhar, Puneet, Wang, Ruoteng, Gupta, Aadesh, Devella, Chaitanya, Guo, Wenlai, Huang, Shangwen, Aoun, Bachir, Hayworth, Greg, Li, Han, Wu, Xintao
In recent years, recommendation systems have evolved from providing a single list of recommendations to offering a comprehensive suite of topic-focused services. To better accomplish this task, conversational recommendation systems (CRS) have progressed from basic retrieval-augmented LLM generation to agentic systems with advanced reasoning and self-correction capabilities. However, agentic systems come with notable response latency--a longstanding challenge for conversational recommendation systems. To balance the trade-off between handling complex queries and minimizing latency, we propose AdaptJobRec, the first conversational job recommendation system that leverages autonomous agent to integrate personalized recommendation algorithm tools. The system employs a user query complexity identification mechanism to minimize response latency. For straightforward queries, the agent directly selects the appropriate tool for rapid responses. For complex queries, the agent uses the memory processing module to filter chat history for relevant content, then passes the results to the intelligent task decomposition planner, and finally executes the tasks using personalized recommendation tools. Evaluation on Walmart's real-world career recommendation scenarios demonstrates that AdaptJobRec reduces average response latency by up to 53.3% compared to competitive baselines, while significantly improving recommendation accuracy.
GrASP: A Generalizable Address-based Semantic Prefetcher for Scalable Transactional and Analytical Workloads
Zirak, Farzaneh, Choudhury, Farhana, Borovica-Gajic, Renata
Data prefetching--loading data into the cache before it is requested--is essential for reducing I/O overhead and improving database performance. While traditional prefetchers focus on sequential patterns, recent learning-based approaches, especially those leveraging data semantics, achieve higher accuracy for complex access patterns. However, these methods often struggle with today's dynamic, ever-growing datasets and require frequent, timely fine-tuning. Privacy constraints may also restrict access to complete datasets, necessitating prefetchers that can learn effectively from samples. To address these challenges, we present GrASP, a learning-based prefetcher designed for both analytical and transactional workloads. GrASP enhances prefetching accuracy and scalability by leveraging logical block address deltas and combining query representations with result encodings. It frames prefetching as a context-aware multi-label classification task, using multi-layer LSTMs to predict delta patterns from embedded context. This delta modeling approach enables GrASP to generalize predictions from small samples to larger, dynamic datasets without requiring extensive retraining. Experiments on real-world datasets and industrial benchmarks demonstrate that GrASP generalizes to datasets 250 times larger than the training data, achieving up to 45% higher hit ratios, 60% lower I/O time, and 55% lower end-to-end query execution latency than existing baselines. On average, GrASP attains a 91.4% hit ratio, a 90.8% I/O time reduction, and a 57.1% execution latency reduction.
Review of Inference-Time Scaling Strategies: Reasoning, Search and RAG
Wang, Zhichao, Wan, Cheng, Nie, Dong
The performance gains of LLMs have historically been driven by scaling up model size and training data. However, the rapidly diminishing availability of high-quality training data is introducing a fundamental bottleneck, shifting the focus of research toward inference-time scaling. This paradigm uses additional computation at the time of deployment to substantially improve LLM performance on downstream tasks without costly model re-training. This review systematically surveys the diverse techniques contributing to this new era of inference-time scaling, organizing the rapidly evolving field into two comprehensive perspectives: Output-focused and Input-focused methods. Output-focused techniques encompass complex, multi-step generation strategies, including reasoning (e.g., CoT, ToT, ReAct), various search and decoding methods (e.g., MCTS, beam search), training for long CoT (e.g., RLVR, GRPO), and model ensemble methods. Input-focused techniques are primarily categorized by few-shot and RAG, with RAG as the central focus. The RAG section is further detailed through a structured examination of query expansion, data, retrieval and reranker, LLM generation methods, and multi-modal RAG.
The Hybrid Multimodal Graph Index (HMGI): A Comprehensive Framework for Integrated Relational and Vector Search
Chandra, Joydeep, Navneet, Satyam Kumar, Zhang, Yong
The proliferation of complex, multimodal datasets has exposed a critical gap between the capabilities of specialized vector databases and traditional graph databases. While vector databases excel at semantic similarity search, they lack the capacity for deep relational querying. Conversely, graph databases master complex traversals but are not natively optimized for high-dimensional vector search. This paper introduces the Hybrid Multimodal Graph Index (HMGI), a novel framework designed to bridge this gap by creating a unified system for efficient, hybrid queries on multimodal data. HMGI leverages the native graph database architecture and integrated vector search capabilities, exemplified by platforms like Neo4j, to combine Approximate Nearest Neighbor Search (ANNS) with expressive graph traversal queries. Key innovations of the HMGI framework include modality-aware partitioning of embeddings to optimize index structure and query performance, and a system for adaptive, low-overhead index updates to support dynamic data ingestion, drawing inspiration from the architectural principles of systems like TigerVector. By integrating semantic similarity search directly with relational context, HMGI aims to outperform pure vector databases like Milvus in complex, relationship-heavy query scenarios and achieve sub-linear query times for hybrid tasks.