AITopics

Country: Asia > China (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.46)
Energy > Oil & Gas > Upstream (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.77)

Neural Information Processing SystemsMar-19-2025, 15:06:25 GMT

UQE: A Query Engine for Unstructured Databases, Bethany Yixin Wang

Analytics on structured data is a mature field with many successful methods. However, most real world data exists in unstructured form, such as images and conversations. We investigate the potential of Large Language Models (LLMs) to enable unstructured data analytics. In particular, we propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections. This engine accepts queries in a Universal Query Language (UQL), a dialect of SQL that provides full natural language flexibility in specifying conditions and operators. The new engine leverages the ability of LLMs to conduct analysis of unstructured data, while also allowing us to exploit advances in sampling and optimization techniques to achieve efficient and accurate query execution. In addition, we borrow techniques from classical compiler theory to better orchestrate the workflow between sampling methods and foundation model calls. We demonstrate the efficiency of UQE on data analytics across different modalities, including images, dialogs and reviews, across a range of useful query types, including conditional aggregation, semantic retrieval and abstraction aggregation.

artificial intelligence, large language model, natural language, (17 more...)

Country:

Europe (0.67)
North America > United States > Oregon (0.14)
North America > Canada > Alberta (0.14)

Genre:

Research Report > Experimental Study (1.00)
Workflow (0.66)

Industry:

Information Technology (0.93)
Media (0.68)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)

Neural Information Processing SystemsMar-19-2025, 09:54:30 GMT

Optimal Query Complexity of Secure Stochastic Convex Optimization Wei Tang,and Yang Liu

We study the secure stochastic convex optimization problem. A learner aims to learn the optimal point of a convex function through sequentially querying a (stochastic) gradient oracle. In the meantime, there exists an adversary who aims to free-ride and infer the learning outcome of the learner from observing the learner's queries. The adversary observes only the points of the queries but not the feedback from the oracle. The goal of the learner is to optimize the accuracy, i.e., obtaining an accurate estimate of the optimal point, while securing her privacy, i.e., making it difficult for the adversary to infer the optimal point. We formally quantify this tradeoff between learner's accuracy and privacy and characterize the lower and upper bounds on the learner's query complexity as a function of desired levels of accuracy and privacy. For the analysis of lower bounds, we provide a general template based on information theoretical analysis and then tailor the template to several families of problems, including stochastic convex optimization and (noisy) binary search. We also present a generic secure learning protocol that achieves the matching upper bound up to logarithmic factors.

learner, machine learning, natural language, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.63)

arXiv.org Artificial IntelligenceMar-19-2025

ACE: A Cardinality Estimator for Set-Valued Queries

Sheng, Yufan, Cao, Xin, Zhao, Kaiqi, Fang, Yixiang, Qi, Jianzhong, Zhang, Wenjie, Jensen, Christian S.

Cardinality estimation is a fundamental functionality in database systems. Most existing cardinality estimators focus on handling predicates over numeric or categorical data. They have largely omitted an important data type, set-valued data, which frequently occur in contemporary applications such as information retrieval and recommender systems. The few existing estimators for such data either favor high-frequency elements or rely on a partial independence assumption, which limits their practical applicability. We propose ACE, an Attention-based Cardinality Estimator for estimating the cardinality of queries over set-valued data. We first design a distillation-based data encoder to condense the dataset into a compact matrix. We then design an attention-based query analyzer to capture correlations among query elements. To handle variable-sized queries, a pooling module is introduced, followed by a regression model (MLP) to generate final cardinality estimates. We evaluate ACE on three datasets with varying query element distributions, demonstrating that ACE outperforms the state-of-the-art competitors in terms of both accuracy and efficiency.

artificial intelligence, machine learning, natural language, (21 more...)

2503.14929

Country:

Europe (0.46)
Oceania > Australia (0.28)
North America > United States (0.28)
Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
(2 more...)

arXiv.org Artificial IntelligenceMar-19-2025

Pseudo-Relevance Feedback Can Improve Zero-Shot LLM-Based Dense Retrieval

Li, Hang, Wang, Xiao, Koopman, Bevan, Zuccon, Guido

Recent advances in language modelling have been motivated the Pseudo-relevance feedback (PRF) refines queries by leveraging initially replacement of encoder-only backbones like BERT with larger retrieved documents to improve retrieval effectiveness. In this decoder-only backbones (generative LLMs) to form dense representations paper, we investigate how large language models (LLMs) can facilitate [2, 13, 23], allowing to leverage richer contextual information PRF for zero-shot LLM-based dense retrieval, extending the and enhancing dense retrieval generalization. Of particular recently proposed PromptReps method. Specifically, our approach interest for this paper is PromptReps [23], an LLM-based approach uses LLMs to extract salient passage features--such as keywords for dense retrieval. PromptReps is unique in that it does not require and summaries--from top-ranked documents, which are then integrated contrastive learning, producing effective representations for dense into PromptReps to produce enhanced query representations.

large language model, machine learning, natural language, (17 more...)

2503.14887

Country:

Asia (0.68)
Oceania > Australia > Queensland (0.14)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Neural Information Processing SystemsMar-18-2025, 23:57:12 GMT

CoADNet: Collaborative Aggregation-and-Distribution Networks for Co-Salient Object Detection

Co-Salient Object Detection (CoSOD) aims at discovering salient objects that repeatedly appear in a given query group containing two or more relevant images. One challenging issue is how to effectively capture co-saliency cues by modeling and exploiting inter-image relationships. In this paper, we present an end-to-end collaborative aggregation-and-distribution network (CoADNet) to capture both salient and repetitive visual patterns from multiple images. First, we integrate saliency priors into the backbone features to suppress the redundant background information through an online intra-saliency guidance structure. After that, we design a two-stage aggregate-and-distribute architecture to explore group-wise semantic interactions and produce the co-saliency features. In the first stage, we propose a group-attentional semantic aggregation module that models inter-image relationships to generate the group-wise semantic representations. In the second stage, we propose a gated group distribution module that adaptively distributes the learned group semantics to different individuals in a dynamic gating mechanism. Finally, we develop a group consistency preserving decoder tailored for the CoSOD task, which maintains group constraints during feature decoding to predict more consistent full-resolution co-saliency maps. The proposed CoADNet is evaluated on four prevailing CoSOD benchmark datasets, which demonstrates the remarkable performance improvement over ten state-of-the-art competitors.

detection, machine learning, natural language, (18 more...)

Country: Asia > China (0.29)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.34)

Neural Information Processing SystemsMar-16-2025, 11:35:37 GMT

UQE: A Query Engine for Unstructured Databases

Analytics on structured data is a mature field with many successful methods.However, most real world data exists in unstructured form, such as images and conversations.We investigate the potential of Large Language Models (LLMs) to enable unstructured data analytics.In particular, we propose a new Universal Query Engine (UQE) that directly interrogates and draws insights from unstructured data collections.This engine accepts queries in a Universal Query Language (UQL), a dialect of SQL that provides full natural language flexibility in specifying conditions and operators.The new engine leverages the ability of LLMs to conduct analysis of unstructured data, while also allowing us to exploit advances in sampling and optimization techniques to achieve efficient and accurate query execution.In addition, we borrow techniques from classical compiler theory to better orchestrate the workflow between sampling methods and foundation model calls.We demonstrate the efficiency of UQE on data analytics across different modalities, including images, dialogs and reviews, across a range of useful query types, including conditional aggregation, semantic retrieval and abstraction aggregation.

large language model, natural language, query engine, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.99)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Zeng, Yimeng, Maus, Natalie, Jones, Haydn Thomas, Tao, Jeffrey, Wan, Fangping, Torres, Marcelo Der Torossian, de la Fuente-Nunez, Cesar, Marcus, Ryan, Bastani, Osbert, Gardner, Jacob R.

Large Scale Multi-Task Bayesian Optimization with Large Language Models

arXiv.org Artificial IntelligenceMar-11-2025

In multi-task Bayesian optimization, the goal is to leverage experience from optimizing existing tasks to improve the efficiency of optimizing new ones. While approaches using multi-task Gaussian processes or deep kernel transfer exist, the performance improvement is marginal when scaling to more than a moderate number of tasks. We introduce a novel approach leveraging large language models (LLMs) to learn from, and improve upon, previous optimization trajectories, scaling to approximately 2000 distinct tasks. Specifically, we propose an iterative framework in which an LLM is fine-tuned using the high quality solutions produced by BayesOpt to generate improved initializations that accelerate convergence for future optimization tasks based on previous search trajectories. We evaluate our method on two distinct domains: database query optimization and antimicrobial peptide design. Results demonstrate that our approach creates a positive feedback loop, where the LLM's generated initializations gradually improve, leading to better optimization performance. As this feedback loop continues, we find that the LLM is eventually able to generate solutions to new tasks in just a few shots that are better than the solutions produced by "from scratch" by Bayesian optimization while simultaneously requiring significantly fewer oracle calls.

large language model, natural language, optimization, (16 more...)

2503.08131

Country: North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.52)

arXiv.org Artificial IntelligenceMar-11-2025

HiP-AD: Hierarchical and Multi-Granularity Planning with Deformable Attention for Autonomous Driving in a Single Decoder

Tang, Yingqi, Xu, Zhuoran, Meng, Zhaotie, Cheng, Erkang

Although end-to-end autonomous driving (E2E-AD) technologies have made significant progress in recent years, there remains an unsatisfactory performance on closed-loop evaluation. The potential of leveraging planning in query design and interaction has not yet been fully explored. In this paper, we introduce a multi-granularity planning query representation that integrates heterogeneous waypoints, including spatial, temporal, and driving-style waypoints across various sampling patterns. It provides additional supervision for trajectory prediction, enhancing precise closed-loop control for the ego vehicle. Additionally, we explicitly utilize the geometric properties of planning trajectories to effectively retrieve relevant image features based on physical locations using deformable attention. By combining these strategies, we propose a novel end-to-end autonomous driving framework, termed HiP-AD, which simultaneously performs perception, prediction, and planning within a unified decoder. HiP-AD enables comprehensive interaction by allowing planning queries to iteratively interact with perception queries in the BEV space while dynamically extracting image features from perspective views. Experiments demonstrate that HiP-AD outperforms all existing end-to-end autonomous driving methods on the closed-loop benchmark Bench2Drive and achieves competitive performance on the real-world dataset nuScenes.

artificial intelligence, natural language, waypoint, (19 more...)

2503.08612

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.34)

arXiv.org Artificial IntelligenceMar-10-2025

Mixture of Structural-and-Textual Retrieval over Text-rich Graph Knowledge Bases

Lei, Yongjia, Han, Haoyu, Rossi, Ryan A., Dernoncourt, Franck, Lipka, Nedim, Halappanavar, Mahantesh M, Tang, Jiliang, Wang, Yu

Text-rich Graph Knowledge Bases (TG-KBs) have become increasingly crucial for answering queries by providing textual and structural knowledge. However, current retrieval methods often retrieve these two types of knowledge in isolation without considering their mutual reinforcement and some hybrid methods even bypass structural retrieval entirely after neighboring aggregation. To fill in this gap, we propose a Mixture of Structural-and-Textual Retrieval (MoR) to retrieve these two types of knowledge via a Planning-Reasoning-Organizing framework. In the Planning stage, MoR generates textual planning graphs delineating the logic for answering queries. Following planning graphs, in the Reasoning stage, MoR interweaves structural traversal and textual matching to obtain candidates from TG-KBs. In the Organizing stage, MoR further reranks fetched candidates based on their structural trajectory. Extensive experiments demonstrate the superiority of MoR in harmonizing structural and textual retrieval with insights, including uneven retrieving performance across different query logics and the benefits of integrating structural trajectories for candidate reranking. Our code is available at https://github.com/Yoega/MoR.

artificial intelligence, large language model, natural language, (19 more...)

2502.20317

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.85)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)
(2 more...)