 Query Processing


INODE: Building an End-to-End Data Exploration System in Practice [Extended Vision]

arXiv.org Artificial Intelligence

A full-fledged data exploration system must combine different access modalities with a powerful concept of guiding the user in the exploration process, by being reactive and anticipative both for data discovery and for data linking. Such systems are a real opportunity for our community to cater to users with different domain and data science expertise. We introduce INODE -- an end-to-end data exploration system -- that leverages, on the one hand, Machine Learning and, on the other hand, semantics for the purpose of Data Management (DM). Our vision is to develop a classic unified, comprehensive platform that provides extensive access to open datasets, and we demonstrate it in three significant use cases in the fields of Cancer Biomarker Research, Research and Innovation Policy Making, and Astrophysics. INODE offers sustainable services in (a) data modeling and linking, (b) integrated query processing using natural language, (c) guidance, and (d) data exploration through visualization, thus helping the user discover new insights. We demonstrate that our system is uniquely accessible to a wide range of users, from larger scientific communities to the public. Finally, we briefly illustrate how this work paves the way for new research opportunities in DM.


Push-down query capabilities: Five questions to ask your cloud BI provider

ZDNet

Software-as-a-service (SaaS) offers many benefits, including but not limited to elasticity: the ability to shrink and grow storage and compute resources on demand. Clients of most leading enterprise business intelligence (BI) platforms enjoy this cloud elasticity benefit but at a cost. Ultimately, elasticity requires both application and data components (compute and store) to be elastic, and therefore, cloud-native BI platforms require that on-premises data be ingested into the cloud platform before it can be analyzed. But not all organizations are ready to let go of their data from inside their firewalls, and they are not ready to commit to a single cloud provider -- most are opting for a hybrid on-premises and multicloud environment. Here's a look at how the cloud leaders stack up, the hybrid market, and the SaaS players that run your company as well as their latest strategic moves.


A Knowledge Compilation Map for Conditional Preference Statements-based Languages

arXiv.org Artificial Intelligence

Conditional preference statements have been used to compactly represent preferences over combinatorial domains. They are at the core of CP-nets and their generalizations, and of lexicographic preference trees. Several works have addressed the complexity of some queries (optimization and dominance in particular). In this paper we extend some of these results and study other queries which have not been addressed so far, like equivalence, thereby contributing to a knowledge compilation map for languages based on conditional preference statements. We also introduce a new parameterised family of languages, which makes it possible to balance expressiveness against the complexity of some queries.
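
As a toy illustration of the dominance queries such languages support, here is a minimal sketch of dominance testing under an unconditional lexicographic preference model, a simple special case of the conditional-preference languages described above; all variable names and values are made up.

```python
# Toy sketch (not the paper's formalism): dominance testing under an
# unconditional lexicographic preference model.

def dominates(a, b, importance, prefs):
    """Return True if outcome `a` is strictly preferred to outcome `b`.

    importance: variable names, most important first.
    prefs: variable -> list of its values, most preferred first.
    """
    for var in importance:
        rank_a = prefs[var].index(a[var])  # lower rank = more preferred
        rank_b = prefs[var].index(b[var])
        if rank_a != rank_b:
            return rank_a < rank_b  # first differing variable decides
    return False  # identical outcomes: no strict dominance

# Hypothetical example: the main course matters more than the wine.
importance = ["main", "wine"]
prefs = {"main": ["fish", "meat"], "wine": ["white", "red"]}
print(dominates({"main": "fish", "wine": "red"},
                {"main": "meat", "wine": "white"},
                importance, prefs))  # fish beats meat, so True
```

Dominance here takes one pass over the variables, which hints at why lexicographic-style languages keep that query tractable while richer languages may not.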


DP-Cryptography

Communications of the ACM

On Feb 15, 2019, John Abowd, chief scientist at the U.S. Census Bureau, announced the results of a reconstruction attack that the Bureau proactively launched using data released under the 2010 Decennial Census [19]. The decennial census released billions of statistics about individuals, like "how many people of the age 10-20 live in New York City" or "how many people live in four-person households." Using only the data publicly released in 2010, an internal team was able to correctly reconstruct records of address (by census block), age, gender, race, and ethnicity for 142 million people (about 46% of the U.S. population), and correctly match these records to commercial datasets circa 2010 to associate personal identifying information such as names for 52 million people (17% of the population). This is not specific to the U.S. Census Bureau; such attacks can occur in any setting where statistical information in the form of deidentified data, statistics, or even machine learning models is released. That such attacks are possible was predicted over 15 years ago in a seminal paper by Irit Dinur and Kobbi Nissim [12]: releasing a sufficiently large number of aggregate statistics with sufficiently high accuracy provides enough information to reconstruct the underlying database with high accuracy. The practicality of such a large-scale reconstruction at the U.S. Census Bureau underscores the grand challenge that public organizations, industry, and scientific research face: how can we safely disseminate the results of data analysis on sensitive databases? An emerging answer is differential privacy. An algorithm satisfies differential privacy (DP) if its output is insensitive to adding, removing, or changing one record in its input database. DP is considered the "gold standard" for privacy for a number of reasons.
It provides a persuasive mathematical proof of privacy to individuals, with several rigorous interpretations [25, 26]. The DP guarantee is composable: repeated invocations of differentially private algorithms lead to a graceful degradation of privacy.
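
As a minimal sketch of how a DP mechanism works in practice, the following implements the classic Laplace mechanism for a sensitivity-1 counting query. This is illustrative only; the Census Bureau's deployed mechanisms are far more involved.

```python
import random

def dp_count(true_count, epsilon, rng=random):
    """Release a count with Laplace noise of scale sensitivity/epsilon.

    A counting query has sensitivity 1: adding or removing one person
    changes it by at most 1. The difference of two Exp(1) draws has a
    Laplace(0, 1) distribution, scaled here to 1/epsilon.
    """
    scale = 1.0 / epsilon
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_count + noise

random.seed(0)
releases = [dp_count(142, epsilon=1.0) for _ in range(10000)]
avg = sum(releases) / len(releases)
# Each single release hides any one individual's contribution, yet the
# zero-mean noise keeps the average of many releases close to 142.
print(avg)
```

Composability shows up directly in this setting: answering two counting queries at epsilon/2 each consumes a total privacy budget of epsilon, which is the graceful degradation the text describes.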


Context-Aware Target Apps Selection and Recommendation for Enhancing Personal Mobile Assistants

arXiv.org Artificial Intelligence

Users install many apps on their smartphones, raising issues related to information overload for users and resource management for devices. Moreover, the recent increase in the use of personal assistants has made mobile devices even more pervasive in users' lives. This paper addresses two research problems that are vital for developing effective personal mobile assistants: target apps selection and recommendation. The former is the key component of a unified mobile search system: a system that addresses the users' information needs for all the apps installed on their devices with a unified mode of access. The latter, instead, predicts the next apps that the users would want to launch. Here we focus on context-aware models to leverage the rich contextual information available to mobile devices. We design an in situ study to collect thousands of mobile queries enriched with mobile sensor data (now publicly available for research purposes). With the aid of this dataset, we study the user behavior in the context of these tasks and propose a family of context-aware neural models that take into account the sequential, temporal, and personal behavior of users. We study several state-of-the-art models and show that the proposed models significantly outperform the baselines.


Modeling Global Semantics for Question Answering over Knowledge Bases

arXiv.org Artificial Intelligence

Semantic parsing, as an important approach to question answering over knowledge bases (KBQA), transforms a question into a complete query graph for further generating the correct logical query. However, state-of-the-art semantic parsing approaches utilize the relational semantics of query graphs while paying little attention to the structure semantics of a question, i.e., the dependencies and relations between all entities in a question, when selecting the query graph. The structure semantics is an important part of the whole semantics of a question (e.g., Figure 1), especially in complex questions, where the complexity of a question often relies on its complicated structure. As a result, existing works that only consider relational semantics cannot always handle complex questions well, so it is necessary to consider the structure semantics of questions together with their relational semantics when semantic parsing in KBQA. In this paper, we present a relational graph convolutional network (RGCN)-based model, gRGCN, that models this multi-relational structure for semantic parsing in KBQA.
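
One toy way to picture the RGCN-style message passing this abstract builds on is a single layer over a tiny graph. Real RGCNs use per-relation weight matrices; scalar weights stand in here, and the graph, relation name, and numbers are all made up.

```python
# Toy sketch of one relational GCN (RGCN) message-passing step over a tiny
# question/query graph.

def rgcn_layer(h, edges, rel_weights, self_weight):
    """h: node -> feature list; edges: (src, relation, dst) triples."""
    out = {}
    for node, feat in h.items():
        agg = [self_weight * x for x in feat]           # self-loop term
        for rel, w in rel_weights.items():
            nbrs = [s for (s, r, d) in edges if r == rel and d == node]
            for src in nbrs:                            # normalized sum
                for i, x in enumerate(h[src]):
                    agg[i] += w * x / len(nbrs)
        out[node] = [max(0.0, x) for x in agg]          # ReLU
    return out

h = {"q": [1.0, 0.0], "e": [0.0, 1.0]}                  # question, entity
edges = [("e", "mentions", "q")]
out = rgcn_layer(h, edges, {"mentions": 0.5}, 1.0)
print(out["q"])  # the question node absorbs part of the entity's features
```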


A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration

arXiv.org Artificial Intelligence

The query optimizer is at the heart of database systems. The cost-based optimizer studied in this paper is adopted in almost all current database systems. A cost-based optimizer uses a plan enumeration algorithm to generate candidate (sub)plans, uses a cost model to estimate the cost of each plan, and selects the plan with the lowest cost. In the cost model, cardinality, the number of tuples passing through an operator, plays a crucial role. Due to inaccuracy in cardinality estimation, errors in the cost model, and the huge plan space, the optimizer cannot find the optimal execution plan for a complex query in a reasonable time. In this paper, we first study in depth the causes behind these limitations. Next, we review the techniques used to improve the quality of the three key components of the cost-based optimizer: cardinality estimation, the cost model, and plan enumeration. We also provide our insights on future directions for each of these aspects.
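
The enumerate/cost/select loop described above can be sketched in a few lines. The cardinalities, join selectivity, and cost formula below are made-up illustrations, not any real system's model.

```python
# Toy sketch of cost-based plan selection: enumerate left-deep join orders
# over three tables and pick the cheapest plan under a simple cost model.
from itertools import permutations

card = {"A": 1000, "B": 10, "C": 100}   # estimated base-table cardinalities
sel = 0.01                              # assumed uniform join selectivity

def plan_cost(order):
    """Sum of estimated intermediate-result sizes along the pipeline."""
    rows = card[order[0]]
    cost = 0.0
    for table in order[1:]:
        rows = rows * card[table] * sel  # estimated join output cardinality
        cost += rows                     # pay for producing it
    return cost

best = min(permutations(card), key=plan_cost)
print(best, plan_cost(best))  # starting from the small tables wins
```

Note that a wrong estimate for any intermediate cardinality changes `plan_cost` and can flip the chosen join order, which is exactly why the survey treats cardinality estimation as the central component.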


Brain-inspired Search Engine Assistant based on Knowledge Graph

arXiv.org Artificial Intelligence

Search engines can quickly return a list of hyperlinks in response to query keywords. However, when a query is complex, developers need to repeatedly refine the search keywords and open a large number of web pages to find and summarize answers. Many research works on question answering (Q&A) systems attempt to assist search engines by providing simple, accurate, and understandable answers. However, without the original semantic contexts, these answers lack explainability, making them difficult for users to trust and adopt. In this paper, a brain-inspired search engine assistant named DeveloperBot, based on a knowledge graph, is proposed; it aligns with the cognitive process of humans and has the capacity to answer complex queries with explainability. Specifically, DeveloperBot first constructs a multi-layer query graph by splitting a complex multi-constraint query into several ordered constraints. It then models the constraint reasoning process as a subgraph search process, inspired by the spreading activation model of cognitive science. In the end, novel features of the subgraph are extracted for decision-making. The corresponding reasoning subgraph and answer confidence are derived as explanations. The results of the decision-making demonstrate that DeveloperBot can estimate answers and answer confidences with high accuracy. We implement a prototype and conduct a user study to evaluate whether and how the direct answers and explanations provided by DeveloperBot can meet developers' information needs.
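
The spreading-activation idea mentioned above can be sketched as a toy iteration over a hand-built graph: activation flows outward from the query node with decay, and candidate answers reached from multiple constraints accumulate support from each path. The graph, node names, and decay factor are illustrative only, not DeveloperBot's actual model.

```python
# Toy sketch of spreading activation over a tiny knowledge graph.

def spread(graph, seeds, decay=0.5, steps=2):
    act = {n: 0.0 for n in graph}
    for s in seeds:
        act[s] = 1.0
    for _ in range(steps):
        nxt = dict(act)
        for node, nbrs in graph.items():
            if act[node] > 0 and nbrs:
                share = decay * act[node] / len(nbrs)  # split among neighbors
                for nb in nbrs:
                    nxt[nb] += share
        act = nxt
    return act

graph = {"query": ["python", "sort"],   # two ordered constraints
         "python": ["timsort"],
         "sort": ["timsort"],
         "timsort": []}                 # candidate answer node
act = spread(graph, seeds=["query"])
print(act["timsort"])  # the answer accumulates activation from both paths
```

The activation levels along the winning paths play the role of the reasoning subgraph and answer confidence that the system surfaces as an explanation.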


FLAT: Fast, Lightweight and Accurate Method for Cardinality Estimation

arXiv.org Artificial Intelligence

Query optimizers rely on accurate cardinality estimation (CardEst) to produce good execution plans. The core problem of CardEst is how to model the rich joint distribution of attributes in an accurate and compact manner. Despite decades of research, existing methods either oversimplify the model by using only independent factorization, which leads to inaccurate estimates and suboptimal query plans, or overcomplicate it with lossless conditional factorization without any independence assumption, which results in slow probability computation. In this paper, we propose FLAT, a CardEst method that is simultaneously fast in probability computation, lightweight in model size, and accurate in estimation quality. The key idea of FLAT is a novel unsupervised graphical model, called FSPN. It utilizes both independent and conditional factorization to adaptively model different levels of attribute correlation, and thus subsumes all existing CardEst models and dovetails their advantages. FLAT supports efficient online probability computation in near-linear time on the underlying FSPN model and provides effective offline model construction. It can estimate cardinality for both single-table queries and multi-table join queries. An extensive experimental study demonstrates the superiority of FLAT over existing CardEst methods on well-known benchmarks: FLAT achieves 1 to 5 orders of magnitude better accuracy, 1 to 3 orders of magnitude faster probability computation (around 0.2 ms), and 1 to 2 orders of magnitude lower storage cost (only tens of KB).
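
A tiny made-up example makes the independence-assumption failure concrete: on a table where the two attributes are correlated, multiplying the marginal selectivities misestimates the joint selectivity.

```python
# Made-up two-attribute table (car model, price band) with correlation.
rows = [("SUV", "high"), ("SUV", "high"), ("SUV", "low"),
        ("sedan", "low"), ("sedan", "low"), ("sedan", "low")]

def sel(pred):
    """True selectivity of a predicate over the toy table."""
    return sum(1 for r in rows if pred(r)) / len(rows)

# True joint selectivity of (model = 'SUV' AND price = 'high'):
true_sel = sel(lambda r: r == ("SUV", "high"))                        # 2/6
# Independent factorization multiplies the marginals instead:
indep_sel = sel(lambda r: r[0] == "SUV") * sel(lambda r: r[1] == "high")
print(true_sel, indep_sel)  # the independence estimate is 2x too low
```

Conditional factorization would recover the exact joint here at the price of a larger model, which is the accuracy/cost trade-off the FSPN design aims to balance adaptively.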


Active Classification with Uncertainty Comparison Queries

arXiv.org Machine Learning

Noisy pairwise comparison feedback has been incorporated to improve the overall query complexity of interactively learning binary classifiers. The positivity comparison oracle provides feedback on which of a pair of data points is more likely to be positive. Because it is impossible to infer accurate labels using this oracle alone without knowing the classification threshold, existing methods still rely on the traditional explicit labeling oracle, which directly answers the label of a given data point. Existing methods sort all data points and use the explicit labeling oracle to find the classification threshold. These methods, however, have two drawbacks: (1) they perform unnecessary sorting for label inference; (2) quicksort is naively adapted to noisy feedback, which negatively affects practical performance. To avoid this inefficiency and acquire information about the classification threshold, we propose a new pairwise comparison oracle concerning uncertainties. This oracle receives two data points as input and answers which one has higher uncertainty. We then propose an efficient adaptive labeling algorithm using the proposed oracle and the positivity comparison oracle. In addition, we address the situation where the labeling budget is insufficient relative to the dataset size, which can be handled by plugging the proposed algorithm into an active learning algorithm. Finally, we confirm the feasibility of the proposed oracle and the performance of the proposed algorithm both theoretically and empirically.
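
As a toy illustration of an uncertainty comparison oracle, consider one-dimensional data with a latent decision threshold: of two points, the more uncertain one is whichever lies closer to the threshold, so a simple pairwise tournament locates the most informative point to query. This is not the paper's algorithm, and all names and numbers are made up.

```python
# Toy uncertainty comparison oracle for points on a line with a latent
# classification threshold.
from functools import reduce

THRESHOLD = 0.37  # latent; the learner never reads it directly

def more_uncertain(x, y):
    """Oracle: return whichever point lies closer to the decision boundary."""
    return x if abs(x - THRESHOLD) < abs(y - THRESHOLD) else y

pool = [0.05, 0.2, 0.5, 0.8, 0.95]
query_point = reduce(more_uncertain, pool)
print(query_point)  # the pool member nearest the latent threshold
```

Note that the tournament needs only pairwise answers, never the threshold itself, which is why such an oracle can replace sorting plus explicit labeling for threshold localization.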