

Query Processing


Bienvenu

AAAI Conferences

While query answering in the presence of description logic (DL) ontologies is a well-studied problem, questions of static analysis such as query containment and query optimization have received less attention. In this paper, we study a rather general version of query containment that, unlike the classical version, cannot be reduced to query answering. First, we allow a restriction to be placed on the vocabulary used in the instance data, which can result in shorter equivalent queries; and second, we allow each query its own ontology rather than assuming a single ontology for both queries, which is crucial in applications to versioning and modularity. We also study global minimization of queries in the presence of DL ontologies, which is more subtle than for classical databases as minimal queries need not be isomorphic.
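As a hedged sketch of the containment notion studied here (the precise definitions are the paper's), write cert(Q, O, A) for the certain answers to Q over an ABox A under an ontology O, and let Σ be the vocabulary allowed in the instance data. Containment with per-query ontologies and a vocabulary restriction can then be stated as:

    Q_1 \subseteq_{\Sigma, O_1, O_2} Q_2
    \quad\Longleftrightarrow\quad
    \mathrm{cert}(Q_1, O_1, \mathcal{A}) \subseteq \mathrm{cert}(Q_2, O_2, \mathcal{A})
    \ \text{ for every ABox } \mathcal{A} \text{ over } \Sigma.

Unlike the classical case with a single ontology, the two ontologies O_1 and O_2 need not coincide, which is what makes the problem relevant for versioning and modularity.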


Libkin

AAAI Conferences

The standard way of answering queries over incomplete databases is to compute certain answers, defined as the intersection of query answers over all complete databases that the incomplete database represents. But is this universally accepted definition correct? We argue that this "one-size-fits-all" definition can often lead to counterintuitive or just plain wrong results, and we propose an alternative framework for defining certain answers. The idea of the framework is to move away from the assumption, standard in the database literature, that query results be given in the form of a database object, and to allow instead two alternative representations of answers: as objects defining all other answers, or as knowledge we can deduce with certainty about all such answers. We show that the latter is often easier to achieve than the former, that in general certain answers need not be defined as an intersection, and that they may well contain missing information. We also show that with a proper choice of semantics, we can often reduce computing certain answers, as either objects or knowledge, to standard query evaluation. We describe the framework in the most general way, applicable to a variety of data models, and test it on three concrete relational semantics of incompleteness: open, closed, and weak closed world.
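For reference, the classical definition that the paper challenges can be written compactly. Writing rep(D) for the set of complete databases represented by the incomplete database D (notation assumed here, not taken from the abstract):

    \mathrm{certain}(Q, D) \;=\; \bigcap_{D' \in \mathrm{rep}(D)} Q(D')

The paper's point is that this intersection-based definition is not the only, and often not the right, way to represent what is certainly true of all answers.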


Reinforcement Learning Based Query Vertex Ordering Model for Subgraph Matching

arXiv.org Artificial Intelligence

Subgraph matching is a fundamental problem in various fields that use graph-structured data. Subgraph matching algorithms enumerate all isomorphic embeddings of a query graph q in a data graph G. An important branch of matching algorithms exploits the backtracking search approach, which recursively extends intermediate results following a matching order of query vertices. It has been shown that the matching order plays a critical role in the time efficiency of these backtracking-based subgraph matching algorithms. In recent years, many advanced techniques for query vertex ordering (i.e., matching order generation) have been proposed to prune unpromising intermediate results according to preset heuristic rules. In this paper, for the first time, we apply Reinforcement Learning (RL) and Graph Neural Network (GNN) techniques to generate high-quality matching orders for subgraph matching algorithms. Instead of using fixed heuristics to generate the matching order, our model can capture and make full use of the graph information, and thus determine the query vertex order with an adaptive, learning-based rule that significantly reduces the number of redundant enumerations. With the help of the reinforcement learning framework, our model is able to consider the long-term benefits rather than only the local information at the current ordering step. Extensive experiments on six real-life data graphs demonstrate that our proposed matching order generation technique can reduce query processing time by up to two orders of magnitude compared to the state-of-the-art algorithms.
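For context, here is a minimal Python sketch of the backtracking scheme the abstract refers to: candidates for each query vertex are extended recursively following a supplied matching order. The function and variable names are illustrative only; the paper's contribution is learning this order with RL and GNNs rather than fixing it by heuristics.

    # Naive backtracking subgraph matching guided by a matching order.
    # q_adj / g_adj: adjacency dicts {vertex: set(neighbours)} for query and data graph.
    def subgraph_match(q_adj, g_adj, order):
        results = []

        def extend(i, mapping):
            if i == len(order):
                results.append(dict(mapping))
                return
            u = order[i]
            # Candidates must be unused and adjacent to all already-mapped neighbours of u.
            mapped_nbrs = [mapping[v] for v in q_adj[u] if v in mapping]
            for c in g_adj:
                if c in mapping.values():
                    continue
                if all(c in g_adj[m] for m in mapped_nbrs):
                    mapping[u] = c
                    extend(i + 1, mapping)
                    del mapping[u]

        extend(0, {})
        return results

    # Example: enumerate embeddings of a triangle query in a small data graph.
    q = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    G = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
    print(subgraph_match(q, G, order=[0, 1, 2]))

A poor ordering explores many partial mappings that can never be completed; a good ordering (whether heuristic or learned, as here) prunes them early.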


Zero-Shot Cost Models for Out-of-the-box Learned Cost Prediction

arXiv.org Artificial Intelligence

In this paper, we introduce zero-shot cost models, which enable learned cost estimation that generalizes to unseen databases. In contrast to state-of-the-art workload-driven approaches, which require executing a large set of training queries on every new database, zero-shot cost models allow a learned cost model to be instantiated out of the box without expensive training data collection. To enable such zero-shot cost models, we suggest a new learning paradigm based on pre-trained cost models. As core contributions to support the transfer of such a pre-trained cost model to unseen databases, we introduce a new model architecture and representation technique for encoding query workloads as input to those models. As we show in our evaluation, zero-shot cost estimation can provide more accurate cost estimates than state-of-the-art models for a wide range of (real-world) databases without requiring any query executions on unseen databases. Furthermore, we show that zero-shot cost models can be used in a few-shot mode that further improves their quality by retraining them with just a small number of additional training queries on the unseen database.
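A hedged illustration of the transferable-featurization idea behind such models: plan operators are encoded via portable statistics rather than database-specific identifiers, so a model pre-trained on other databases can be applied directly. The function name and feature keys below are invented for this sketch, not the paper's API.

    # Illustrative zero-shot-style plan featurization: only transferable features,
    # no database-specific identifiers such as table or column names.
    def encode_operator(op):
        return [
            float(op["estimated_rows"]),    # optimizer cardinality estimate
            float(op["estimated_width"]),   # average tuple width in bytes
            float(op["table_pages"]),       # size of the scanned relation, if any
            1.0 if op["type"] == "SeqScan" else 0.0,
            1.0 if op["type"] == "IndexScan" else 0.0,
            1.0 if op["type"] == "HashJoin" else 0.0,
        ]

    plan = [
        {"type": "SeqScan", "estimated_rows": 1e6, "estimated_width": 64, "table_pages": 12000},
        {"type": "HashJoin", "estimated_rows": 5e4, "estimated_width": 128, "table_pages": 0},
    ]
    features = [encode_operator(op) for op in plan]  # fed to a model over the plan structure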


Baihe: SysML Framework for AI-driven Databases

arXiv.org Artificial Intelligence

We present Baihe, a SysML framework for AI-driven databases. Using Baihe, an existing relational database system can be retrofitted to use learned components for query optimization or other common tasks, such as learned structures for indexing. To ensure the practicality and real-world applicability of Baihe, its high-level architecture is based on the following requirements: separation from the core system; minimal third-party dependencies; robustness, stability, and fault tolerance; and configurability. Based on this high-level architecture, we then describe a concrete implementation of Baihe for PostgreSQL and present example use cases for learned query optimizers. To serve both practitioners and researchers in the DB and AI4DB communities, Baihe for PostgreSQL will be released under an open-source license.
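The requirements above can be pictured with a small, purely hypothetical Python sketch (this is not Baihe's actual interface): a learned advisor runs out of process, can be switched off by configuration, and any failure falls back to the database's built-in behavior.

    # Hypothetical illustration of separation from the core system, fault tolerance,
    # and configurability; NOT Baihe's actual API.
    import json
    import socket

    LEARNED_OPTIMIZER_ENABLED = True      # configurability: the component can be disabled
    ADVISOR_ADDR = ("127.0.0.1", 9099)    # learned advisor runs in a separate process

    def choose_plan(query_info, native_plan):
        """Ask an out-of-process learned advisor for a plan hint, falling back safely."""
        if not LEARNED_OPTIMIZER_ENABLED:
            return native_plan
        try:
            with socket.create_connection(ADVISOR_ADDR, timeout=0.05) as conn:
                conn.sendall(json.dumps(query_info).encode())
                hint = json.loads(conn.recv(65536).decode())
            return hint.get("plan", native_plan)
        except (OSError, ValueError):
            # Fault tolerance: any failure of the learned component
            # falls back to the database's built-in plan.
            return native_plan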


Shape Fragments

arXiv.org Artificial Intelligence

In constraint languages for RDF graphs, such as ShEx and SHACL, constraints on nodes and their properties in RDF graphs are known as "shapes". Schemas in these languages list the various shapes that certain targeted nodes must satisfy for the graph to conform to the schema. Using SHACL, we propose in this paper a novel use of shapes, by which a set of shapes is used to extract a subgraph from an RDF graph, the so-called shape fragment. Our proposed mechanism fits in the framework of Linked Data Fragments. In this paper, (i) we define our extraction mechanism formally, building on recently proposed SHACL formalizations; (ii) we establish correctness properties, which relate shape fragments to notions of provenance for database queries; (iii) we compare shape fragments with SPARQL queries; (iv) we discuss implementation options; and (v) we present initial experiments demonstrating that shape fragments are a feasible new idea.
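As a rough, simplified illustration only: the paper's extraction is defined via SHACL's formal semantics, whereas the rdflib sketch below approximates a fragment by keeping the outgoing triples of targeted nodes. The file names, namespace, and target selection are assumptions made for this sketch.

    # Simplified "fragment" extraction: keep the outgoing triples of targeted nodes.
    from rdflib import Graph, Namespace, RDF

    EX = Namespace("http://example.org/")        # assumed vocabulary
    data = Graph()
    data.parse("data.ttl", format="turtle")      # assumed input file

    # Assumed target selection: all nodes declared to be ex:Person.
    targets = set(data.subjects(RDF.type, EX.Person))

    fragment = Graph()
    for node in targets:
        for triple in data.triples((node, None, None)):
            fragment.add(triple)

    fragment.serialize(destination="fragment.ttl", format="turtle")

The actual shape-fragment semantics keeps exactly the triples involved in checking the shapes against their targets, which is what ties it to provenance notions for database queries.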


SLOs Made Easier with Nobl9 and Amazon CloudWatch Metrics Insights (Preview)

#artificialintelligence

Amazon CloudWatch has recently launched Metrics Insights, a fast, flexible, SQL-based query engine that lets customers identify trends and patterns across millions of operational metrics in real time. Metrics Insights allows customers to easily query and analyze metrics to gain better visibility into the health and performance of their infrastructure and large-scale applications. Nobl9 and Amazon Web Services (AWS) have collaborated to extend the existing Nobl9 CloudWatch integration with CloudWatch Metrics Insights (Preview). This will help users retrieve metrics even faster and gain added flexibility in querying raw service level indicator (SLI) data to use for their SLOs. Nobl9 launched the first version of its CloudWatch integration in September 2021, giving customers a versatile tool to monitor their products.
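For illustration, a Metrics Insights query can be issued through the standard GetMetricData API; the query text below is a generic example and has nothing to do with the Nobl9 integration specifically.

    # Hedged example: running a CloudWatch Metrics Insights SQL query via boto3.
    import boto3
    from datetime import datetime, timedelta, timezone

    cloudwatch = boto3.client("cloudwatch")
    now = datetime.now(timezone.utc)

    resp = cloudwatch.get_metric_data(
        MetricDataQueries=[{
            "Id": "q1",
            "Expression": 'SELECT AVG(CPUUtilization) FROM SCHEMA("AWS/EC2", InstanceId) GROUP BY InstanceId',
            "Period": 60,
        }],
        StartTime=now - timedelta(hours=1),
        EndTime=now,
    )
    for result in resp["MetricDataResults"]:
        print(result["Label"], result["Values"][:5])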


Parallel Logic Programming: A Sequel

arXiv.org Artificial Intelligence

Multi-core and highly connected architectures have become ubiquitous, and this has brought renewed interest in language-based approaches to the exploitation of parallelism. Since its inception, logic programming has been recognized as a programming paradigm with great potential for automated exploitation of parallelism. The comprehensive survey of the first twenty years of research in parallel logic programming, published in 2001, has since served as a fundamental reference for researchers and developers. Its contents remain valid today, but the field has continued to evolve at a fast pace in the years that followed. Many of these achievements and much of the ongoing research have been driven by the rapid pace of technological innovation, which has led to advances such as very large clusters, the wide diffusion of multi-core processors, the game-changing role of general-purpose graphics processing units, and the ubiquitous adoption of cloud computing. This has been paralleled by significant advances within logic programming, such as tabling, more powerful static analysis and verification, the rapid growth of Answer Set Programming, and, in general, more mature implementations and systems. This survey reviews the research in parallel logic programming covering the period since 2001, thus providing a natural continuation of the previous survey. The goal of the survey is to serve not only as a reference for researchers and developers of logic programming systems, but also as engaging reading for anyone interested in logic and as a useful source for researchers in parallel systems outside logic programming. Under consideration in Theory and Practice of Logic Programming (TPLP).


Outlining and Filling: Hierarchical Query Graph Generation for Answering Complex Questions over Knowledge Graph

arXiv.org Artificial Intelligence

Query graph building aims to construct correct, executable SPARQL queries over a knowledge graph for answering natural language questions. Although recent approaches perform well through NN-based query graph ranking, more complex questions bring three new challenges: complicated SPARQL syntax, a huge search space for ranking, and noisy query graphs with local ambiguity. This paper addresses these challenges. First, we regard common complicated SPARQL syntax as sub-graphs comprising vertices and edges and propose a new unified query graph grammar to accommodate them. We then propose a new two-stage approach to build query graphs. In the first stage, the top-$k$ related instances (entities, relations, etc.) are collected by simple strategies as the candidate instances. In the second stage, a graph generation model performs hierarchical generation: it first outlines a graph structure whose vertices and edges are empty slots, and then fills the appropriate instances into the slots, thereby completing the query graph. Our approach decomposes the unbearable search space over entire query graphs into affordable sub-spaces of operations while leveraging global structural information to eliminate local ambiguity. Experimental results demonstrate that our approach greatly improves on the state of the art on the hardest KGQA benchmarks and performs excellently on complex questions.
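A hedged, toy illustration of the outline-and-fill idea (structure first, instances second): the slot names, candidate instances, and helper function below are invented for this sketch and do not reflect the paper's model.

    # Illustrative outline-and-fill: outline a structure with empty slots,
    # then fill the slots from pre-collected candidate instances.
    from itertools import product

    # Stage 1: an outlined query graph whose vertices/edges are empty slots.
    outline = {
        "vertices": ["?answer", "SLOT_ENTITY"],
        "edges": [("?answer", "SLOT_RELATION", "SLOT_ENTITY")],
    }

    # Stage 2: top-k candidate instances collected beforehand.
    candidates = {
        "SLOT_ENTITY": ["dbr:Berlin", "dbr:Germany"],
        "SLOT_RELATION": ["dbo:capital", "dbo:birthPlace"],
    }

    def fill(outline, candidates):
        slots = sorted(candidates)
        for combo in product(*(candidates[s] for s in slots)):
            binding = dict(zip(slots, combo))
            edges = [tuple(binding.get(x, x) for x in e) for e in outline["edges"]]
            yield edges  # each filled graph can be scored / translated to SPARQL

    for graph in fill(outline, candidates):
        print(graph)

Keeping the structural outline fixed while only the slot fillings vary is what shrinks the search space from whole query graphs to per-slot choices.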


Phoebe: A Learning-based Checkpoint Optimizer

arXiv.org Artificial Intelligence

Easy-to-use programming interfaces paired with cloud-scale processing engines have enabled big data system users to author arbitrarily complex analytical jobs over massive volumes of data. However, as the complexity and scale of analytical jobs increase, they encounter a number of unforeseen problems: hotspots with large intermediate data on temporary storage, longer job recovery times after failures, and worse query optimizer estimates are examples of issues that we face at Microsoft. To address these issues, we propose Phoebe, an efficient learning-based checkpoint optimizer. Given a set of constraints and an objective function at compile time, Phoebe is able to determine the decomposition of job plans and the optimal set of checkpoints whose outputs should be preserved to durable global storage. Phoebe consists of three machine learning predictors and one optimization module. For each stage of a job, Phoebe makes accurate predictions for: (1) the execution time, (2) the output size, and (3) the start/end time, taking into account the inter-stage dependencies. Using these predictions, we formulate checkpoint optimization as an integer programming problem and propose a scalable heuristic algorithm that meets the latency requirements of the production environment. We demonstrate the effectiveness of Phoebe on production workloads and show that we can free the temporary storage on hotspots by more than 70% and restart failed jobs 68% faster on average with minimal performance impact. Phoebe also shows that adding multiple sets of checkpoints is not cost-efficient, which dramatically reduces the complexity of the optimization.
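A hedged, deliberately simplified sketch of what a checkpoint-selection integer program of this general flavor can look like (the symbols are our own illustration, not Phoebe's exact formulation). With x_i \in \{0,1\} indicating whether the output of stage i is checkpointed,

    \min_{x \in \{0,1\}^{n}} \ \sum_{i=1}^{n} x_i \,\widehat{\mathrm{size}}_i
    \qquad \text{s.t.} \qquad \widehat{\mathrm{restart}}_i(x) \le T \ \ \text{for all stages } i,

where \widehat{\mathrm{size}}_i is the predicted output size of stage i, \widehat{\mathrm{restart}}_i(x) is the predicted recovery time of stage i given the chosen checkpoints, and T is a recovery-time budget. The learned predictors supply the estimated sizes and times that parametrize such a program, and a heuristic solves it at production scale.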