Cost-Based Query Optimization via AI Planning

AAAI Conferences

In this paper we revisit the problem of generating query plans using AI automated planning with a view to leveraging significant recent advances in state-of-the-art planning techniques. Our efforts focus on the specific problem of cost-based join-order optimization for conjunctive relational queries, a critical component of production-quality query optimizers. We characterize the general query-planning problem as a delete-free planning problem, and query plan optimization as a context-sensitive cost-optimal planning problem. We propose algorithms that generate high-quality query plans, guaranteeing optimality under certain conditions. Our approach is general, supporting the use of a broad suite of domain-independent and domain-specific optimization criteria. Experimental results demonstrate the effectiveness of AI planning techniques for query plan generation and optimization.


Using Knowledge of Redundancy for Query Optimization in Mediators

AAAI Conferences

Many autonomous and heterogeneous information sources are becoming increasingly available to users through the Internet, especially through the World Wide Web. In order to make the information available in a consolidated, uniform, and efficient manner, it is necessary to integrate the different information sources. The integration of Internet sources poses several challenges that have not been sufficiently addressed by work on the integration of corporate databases residing on an Intranet [LMR90]. We believe that the most important ones are heterogeneity, large number of sources, redundancy, availability, source autonomy, and diverse access methods and querying interfaces.


Query Optimization and Caching

AAAI Conferences

Department of Computer Science York University, Toronto, Canada j arek cs, yorku, ca * Semantic Query Caching Query caching can play a vital role in heterogeneous, multi-database environments. Answers to a query that are available in cache at the local client can be returned to the user quickly, while the rest of the query is evaluated. By caching certain sensitive data locally, caches can be used to answer the parts of queries that involve the sensitive data, so it need not be shipped across the network. Most prior cache schemes have been tuple-based or page-based. It is unclear, however, how these might be adapted for multi-databases.


Multi-Objective Parametric Query Optimization

Communications of the ACM

We propose a generalization of the classical database query optimization problem: multi-objective parametric query (MPQ) optimization. MPQ compares alternative processing plans according to multiple execution cost metrics. It also models missing pieces of information on which plan costs depend upon as parameters. Both features are crucial to model query processing on modern data processing platforms. MPQ generalizes previously proposed query optimization variants, such as multi-objective query optimization, parametric query optimization, and traditional query optimization.


Learning to Speed Up Query Planning in Graph Databases

AAAI Conferences

Querying graph structured data is a fundamental operation that enables important applications including knowledge graph search, social network analysis, and cyber-network security. However, the growing size of real-world data graphs poses severe challenges for graph databases to meet the response-time requirements of the applications. Planning the computational steps of query processing — Query Planning — is central to address these challenges. In this paper, we study the problem of learning to speedup query planning in graph databases towards the goal of improving the computational-efficiency of query processing via training queries. We present a Learning to Plan (L2P) framework that is applicable to a large class of query reasoners that follow the Threshold Algorithm (TA) approach. First, we define a generic search space over candidate query plans, and identify target search trajectories (query plans) corresponding to the training queries by performing an expensive search. Subsequently, we learn greedy search control knowledge to imitate the search behavior of the target query plans. We provide a concrete instantiation of our L2P framework for STAR, a state-of-the-art graph query reasoner. Our experiments on benchmark knowledge graphs including dbpedia, yago, and freebase show that using the query plans generated by the learned search control knowledge, we can significantly improve the speed of STAR with negligible loss in accuracy.