Query Processing
Turbocharging Database Query Processing and Testing
Database management systems (DBMS) constitute the backbone of today's information-rich society. A primary reason for the popularity of database systems is their support for declarative queries, typically in the SQL query language. In this programming paradigm, the user only specifies the end objectives, leaving it to the DBMS to automatically identify the optimal execution strategy to achieve these objectives. Declarative specification of queries is also central to parallel query execution in modern big data platforms. Query processing and optimization have been extensively researched for close to five decades now, and are implemented in all contemporary database systems.
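To make the declarative paradigm concrete, here is a minimal sketch using Python's built-in sqlite3 module (the schema and data are hypothetical, chosen only for illustration): the query states what result is wanted, and EXPLAIN QUERY PLAN exposes the execution strategy the engine picked on its own.

```python
# Minimal illustration of declarative querying: the query states *what* to
# retrieve; the engine decides *how* (scan vs. index, join order).
# The schema and data below are hypothetical, for illustration only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders(id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE TABLE customers(id INTEGER PRIMARY KEY, region TEXT);
    CREATE INDEX idx_orders_customer ON orders(customer_id);
""")

query = """
    SELECT c.region, SUM(o.total)
    FROM orders o JOIN customers c ON o.customer_id = c.id
    GROUP BY c.region
"""

# The optimizer, not the user, chooses the execution strategy;
# EXPLAIN QUERY PLAN shows the plan it selected for this declarative query.
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row)
```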
Query Optimization Properties of Modified VBS
Kłopotek, Mieczysław A., Wierzchoń, Sławomir T.
Valuation-Based Systems (VBS) can represent knowledge in different domains, including probability theory, Dempster-Shafer theory and possibility theory. More recent studies show that the VBS framework is also appropriate for representing and solving Bayesian decision problems and optimization problems. In this paper, after introducing the valuation-based system (VBS) framework, we present Markov-like properties of VBS and a method for resolving queries to VBS. Though graphical representation of domain knowledge has quite a long history, its full potential was not recognized until recently. We should mention here the pioneering works of J. Pearl, reported in his monograph published in 1988 [Pearl, 1988]. Further development in this domain was achieved by Shenoy and Shafer [1986], who adopted a method used in solving nonserial dynamic programming problems [Bertele & Brioschi, 1972]. This trick proved to be very fruitful and gave rise to a unified framework for uncertainty representation and reasoning, called Valuation-Based System, VBS for short [Shenoy, 1989].
Optimal query complexity for private sequential learning
Motivated by privacy concerns in many practical applications such as Federated Learning, we study a stylized private sequential learning problem: a learner tries to estimate an unknown scalar value, by sequentially querying an external database and receiving binary responses; meanwhile, a third-party adversary observes the learner's queries but not the responses. The learner's goal is to design a querying strategy with the minimum number of queries (optimal query complexity) so that she can accurately estimate the true value, while the adversary even with the complete knowledge of her querying strategy cannot. Prior work has obtained both upper and lower bounds on the optimal query complexity; however, these bounds exhibit a large gap in general. In this paper, we construct new querying strategies and prove almost matching upper and lower bounds, providing a complete characterization of the optimal query complexity as a function of the estimation accuracy and the desired levels of privacy.
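For orientation, the non-private baseline here is plain bisection: a learner who ignores the adversary can locate the value to accuracy $\epsilon$ with about $\log_2(1/\epsilon)$ binary queries. The sketch below is my own illustration of that baseline in Python, not the paper's private strategy; the private strategies analyzed in the paper necessarily spend additional queries to obscure the target.

```python
# Hedged sketch of the *non-private* baseline: bisection over [0, 1] with
# binary threshold responses, costing roughly log2(1/eps) queries. This is
# only the benchmark the privacy overhead is measured against.
import math

def bisection_estimate(respond, eps):
    """respond(q) returns True iff the unknown value is >= q (binary response)."""
    lo, hi = 0.0, 1.0
    queries = 0
    while hi - lo > eps:
        mid = (lo + hi) / 2
        queries += 1
        if respond(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, queries

true_value = 0.3141
estimate, n_queries = bisection_estimate(lambda q: true_value >= q, eps=1e-3)
print(estimate, n_queries, math.ceil(math.log2(1 / 1e-3)))  # ~10 queries
```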
CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases
Yu, Tao, Zhang, Rui, Er, He Yang, Li, Suyi, Xue, Eric, Pang, Bo, Lin, Xi Victoria, Tan, Yi Chern, Shi, Tianze, Li, Zihan, Jiang, Youxuan, Yasunaga, Michihiro, Shim, Sungrok, Chen, Tao, Fabbri, Alexander, Li, Zifan, Chen, Luyao, Zhang, Yuwen, Dixit, Shreya, Zhang, Vincent, Xiong, Caiming, Socher, Richard, Lasecki, Walter S, Radev, Dragomir
CoSQL consists of 30k turns plus 10k annotated SQL queries, obtained from a Wizard-of-Oz (WOZ) collection of 3k dialogues querying 200 complex DBs spanning 138 domains. Each dialogue simulates a real-world DB query scenario with a crowd worker as a user exploring the DB and a SQL expert retrieving answers with SQL, clarifying ambiguous questions, or otherwise flagging unanswerable questions. When user questions are answerable by SQL, the expert describes the SQL and execution results to the user, thus maintaining a natural interaction flow. CoSQL introduces new challenges compared to existing task-oriented dialogue datasets: (1) the dialogue states are grounded in SQL, a domain-independent executable representation, instead of domain-specific slot-value pairs, and (2) because testing is done on unseen databases, success requires generalizing to new domains. CoSQL includes three tasks: SQL-grounded dialogue state tracking, response generation from query results, and user dialogue act prediction. We evaluate a set of strong baselines for each task and show that CoSQL presents significant challenges for future research. The dataset, baselines, and leaderboard will be released at https://yale-lily.github.io/cosql.
Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity
Huang, Feihu, Gao, Shangqian, Pei, Jian, Huang, Heng
Zeroth-order (gradient-free) methods are a class of powerful optimization tools for many machine learning problems because they only need function values (not gradients) in the optimization. In particular, zeroth-order methods are well suited to many complex problems such as black-box attacks and bandit feedback, whose explicit gradients are difficult or infeasible to obtain. Although many zeroth-order methods have been developed recently, these approaches still have two main drawbacks: 1) high function query complexity; 2) not being well suited to problems with complex penalties and constraints. To address these drawbacks, in this paper we propose a novel fast zeroth-order stochastic alternating direction method of multipliers (ADMM) method (\emph{i.e.}, ZO-SPIDER-ADMM) with lower function query complexity for solving nonconvex problems with multiple nonsmooth penalties. Moreover, we prove that our ZO-SPIDER-ADMM has the optimal function query complexity of $O(dn + dn^{\frac{1}{2}}\epsilon^{-1})$ for finding an $\epsilon$-approximate local solution, where $n$ and $d$ denote the sample size and dimension of the data, respectively. In particular, ZO-SPIDER-ADMM improves the existing best nonconvex zeroth-order ADMM methods by a factor of $O(d^{\frac{1}{3}}n^{\frac{1}{6}})$. Moreover, we propose a fast online ZO-SPIDER-ADMM (\emph{i.e.}, ZOO-SPIDER-ADMM). Our theoretical analysis shows that ZOO-SPIDER-ADMM has a function query complexity of $O(d\epsilon^{-\frac{3}{2}})$, which improves the existing best result by a factor of $O(\epsilon^{-\frac{1}{2}})$. Finally, we use a task of structured adversarial attack on black-box deep neural networks to demonstrate the efficiency of our algorithms.
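The core primitive behind such methods is estimating a gradient from function values alone. The snippet below is a generic two-point random-direction estimator, a textbook building block rather than the ZO-SPIDER-ADMM algorithm itself; the smoothing radius, number of directions, and toy objective are illustrative choices of my own.

```python
# Hedged sketch of a generic two-point zeroth-order gradient estimator:
# gradient information is recovered from function values only, which is why
# such methods suit black-box objectives. Not the paper's ZO-SPIDER-ADMM.
import numpy as np

def zo_gradient(f, x, mu=1e-4, num_directions=20, rng=None):
    """Estimate grad f(x) from 2 * num_directions function evaluations."""
    rng = rng or np.random.default_rng(0)
    d = x.shape[0]
    g = np.zeros(d)
    for _ in range(num_directions):
        u = rng.standard_normal(d)
        # Finite difference along a random direction u, projected back onto u.
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / num_directions

# Toy black-box objective (quadratic); only function values are queried.
f = lambda x: 0.5 * np.sum(x ** 2)
x = np.ones(5)
for _ in range(100):
    x = x - 0.1 * zo_gradient(f, x)
print(np.linalg.norm(x))  # the iterate shrinks toward the minimizer at 0
```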
Cyber-All-Intel: An AI for Security related Threat Intelligence
Mittal, Sudip, Joshi, Anupam, Finin, Tim
Keeping up with threat intelligence is a must for a security analyst today. There is a large volume of information present in 'the wild' that affects an organization. We need to develop an artificial intelligence system that scours the intelligence sources to keep the analyst updated about various threats that pose a risk to her organization. A security analyst who is better 'tapped in' can be more effective. In this paper we present Cyber-All-Intel, an artificial intelligence system to aid a security analyst. It is a system for knowledge extraction, representation and analytics in an end-to-end pipeline grounded in the cybersecurity informatics domain. It uses multiple knowledge representations, such as vector spaces and knowledge graphs, in a 'VKG structure' to store incoming intelligence. The system also uses neural network models to proactively improve its knowledge. We have also created a query engine and an alert system that can be used by an analyst to find actionable cybersecurity insights.
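As a rough schematic of the idea of pairing a vector store with a knowledge graph (my own minimal sketch, not the authors' VKG implementation; the embeddings are random stand-ins and the triples are example data), one could keep both representations side by side and answer a query via nearest-neighbor lookup in the vector space followed by a one-hop graph lookup:

```python
# Schematic of keeping a vector store and a knowledge graph side by side.
# Illustrative toy only, not the Cyber-All-Intel system: embeddings are
# random placeholders and the triples are hand-picked examples.
import numpy as np

rng = np.random.default_rng(0)
vector_store = {                      # entity -> embedding
    "CVE-2021-44228": rng.standard_normal(8),
    "log4j":          rng.standard_normal(8),
    "ransomware":     rng.standard_normal(8),
}
knowledge_graph = [                   # (subject, relation, object) triples
    ("CVE-2021-44228", "affects", "log4j"),
    ("CVE-2021-44228", "hasSeverity", "critical"),
]

def nearest(query_vec):
    """Nearest-neighbor lookup in the vector space (dot-product similarity)."""
    return max(vector_store, key=lambda e: float(query_vec @ vector_store[e]))

def related(entity):
    """One-hop lookup in the graph for the matched entity."""
    return [t for t in knowledge_graph if entity in (t[0], t[2])]

entity = nearest(vector_store["log4j"])   # stand-in for an analyst query embedding
print(entity, related(entity))
```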
The case for network-accelerated query processing
Datastores continue to advance on a number of fronts. Some of those that come to mind are adapting to faster networks (e.g. 'FaRM: Fast Remote Memory'), exploiting persistent memory (see e.g. 'Let's talk about storage and recovery methods for non-volatile memory database systems'), and deeply integrating approximate query processing. Today's paper gives us an exciting look at the untapped potential for network-accelerated query processing.
Dynamic Online Gradient Descent with Improved Query Complexity: A Theoretical Revisit
Zhao, Yawei, Zhu, En, Liu, Xinwang, Yin, Jianping
We provide a new theoretical analysis framework to investigate online gradient descent in the dynamic environment. Compared with previous work, the new framework recovers the state-of-the-art dynamic regret but does not require extra gradient queries at every iteration. Specifically, when the functions are $\alpha$-strongly convex and $\beta$-smooth, previous work requires $O(\kappa)$ gradient queries at every iteration, with $\kappa = \frac{\beta}{\alpha}$, to achieve the state-of-the-art dynamic regret. Our framework shows that the query complexity can be improved to $O(1)$, independent of $\kappa$. The improvement is significant for ill-conditioned problems, because their objective functions usually have a large $\kappa$.
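The $O(1)$-query regime the analysis concerns is simply online gradient descent issuing a single gradient evaluation per round. The sketch below is a minimal illustration under my own assumptions (drifting quadratic losses, a fixed step size), not the paper's exact setting or regret analysis.

```python
# Hedged sketch: online gradient descent with exactly one gradient query per
# round. The drifting quadratic losses and step size are illustrative
# placeholders, not the paper's setting.
import numpy as np

def ogd_dynamic(loss_grads, x0, eta):
    """One gradient query per round: x_{t+1} = x_t - eta * grad_t(x_t)."""
    x = np.array(x0, dtype=float)
    iterates = [x.copy()]
    for grad_t in loss_grads:
        x = x - eta * grad_t(x)          # single gradient evaluation per round
        iterates.append(x.copy())
    return iterates

# Drifting targets: f_t(x) = 0.5 * ||x - c_t||^2 with a slowly moving c_t.
T, d = 50, 3
centers = [np.full(d, 0.02 * t) for t in range(T)]
grads = [lambda x, c=c: x - c for c in centers]
trajectory = ogd_dynamic(grads, x0=np.zeros(d), eta=0.5)
print(trajectory[-1], centers[-1])  # the iterate tracks the moving target
```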
Query Complexity of Bayesian Private Learning
We study the query complexity of Bayesian Private Learning: a learner wishes to locate a random target within an interval by submitting queries, in the presence of an adversary who observes all of her queries but not the responses. How many queries are necessary and sufficient in order for the learner to accurately estimate the target, while simultaneously concealing the target from the adversary? Our main result is a query complexity lower bound that is tight up to the first order. We show that if the learner wants to estimate the target within an error of $\epsilon$, while ensuring that no adversary estimator can achieve a constant additive error with probability greater than $1/L$, then the query complexity is on the order of $L\log(1/\epsilon)$ as $\epsilon \to 0$. Our result demonstrates that increased privacy, as captured by $L$, comes at the expense of a \emph{multiplicative} increase in query complexity. The proof builds on Fano's inequality and properties of certain proportional-sampling estimators.
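To make the multiplicative cost concrete (illustrative numbers of my own, using base-2 logarithms for the comparison):
\[
\epsilon = 10^{-3},\ L = 4:\qquad
\underbrace{\log_2(1/\epsilon) \approx 10 \text{ queries}}_{\text{no privacy constraint}}
\qquad\text{vs.}\qquad
\underbrace{L\,\log_2(1/\epsilon) \approx 40 \text{ queries}}_{\text{privacy level } L}
\]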