Expressive query languages are gaining relevance in knowledge representation (KR), and new reasoning problems come to the fore. Query containment is especially interesting in this context. The problem is known to be decidable for many expressive query languages, but exact complexities are often missing. We introduce a new query language, guarded queries (GQ), which generalizes most known languages where query containment is decidable. GQs can be nested (more expressive) or restricted to linear recursion (less expressive). Our comprehensive analysis of the computational properties and expressiveness of (linear/nested) GQs also yields insights on many previous languages.
Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, (602) 965-2735

Abstract

The most costly aspect of gathering information over the Internet is that of transferring data over the network to answer the user's query. We make two contributions in this paper that alleviate this problem. First, we present an algorithm for reducing the number of information sources in an information gathering (IG) plan by reasoning with localized closed world (LCW) statements. In contrast to previous work on this problem, our algorithm can handle recursive information gathering plans that arise commonly in practice. Second, we present a method for reducing the amount of network traffic generated while executing an information gathering plan by reordering the sequence in which queries are sent to remote information sources. We will explain why a direct application of traditional distributed database methods to this problem does not work, and present a novel and cheap way of adorning source descriptions to assist in ordering the queries.

Introduction

The explosive growth and popularity of the worldwide web have resulted in thousands of structured queryable information sources on the Internet, and the promise of unprecedented information-gathering capabilities to lay users. Unfortunately, the promise has not yet been transformed into reality. While there are sources relevant to virtually any user query, the morass of sources presents a formidable hurdle to effectively accessing the information.
We propose an algorithm to reformulate aggregate queries using views in a local-as-view (LAV) data integration setting. Our algorithm considers a special case of reformulations where aggregates in the query are expressed as views over aggregates in the view definitions. Although the problem of determining whether two queries are equivalent is undecidable, our algorithm returns an equivalent rewriting if one exists.
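To make the idea of expressing a query aggregate over an aggregate in a view concrete, here is a minimal sketch. It is not the algorithm proposed above; it only illustrates one simple instance, assuming a hypothetical `sales` relation, a hypothetical view that stores per-store, per-day sums, and a query asking for per-store sums. All names and data are invented for illustration.

```python
from collections import defaultdict

# Hypothetical base relation (store, day, amount); not taken from the source.
sales = [("s1", "mon", 10), ("s1", "mon", 5), ("s1", "tue", 7), ("s2", "mon", 3)]

def view_daily_totals(rows):
    """Assumed view: SUM(amount) grouped by (store, day)."""
    totals = defaultdict(int)
    for store, day, amount in rows:
        totals[(store, day)] += amount
    return dict(totals)

def query_store_totals(rows):
    """Query evaluated directly on the base relation: SUM(amount) grouped by store."""
    totals = defaultdict(int)
    for store, _day, amount in rows:
        totals[store] += amount
    return dict(totals)

def rewriting_over_view(view):
    """Candidate rewriting: the same query expressed as a SUM over the view's sums."""
    totals = defaultdict(int)
    for (store, _day), subtotal in view.items():
        totals[store] += subtotal
    return dict(totals)

direct = query_store_totals(sales)
rewritten = rewriting_over_view(view_daily_totals(sales))
```

On this data the direct evaluation and the rewriting agree, because a sum of partial sums over a partition of the rows equals the total sum. The same trick does not carry over unchanged to every aggregate (an average of averages, for example, is generally not the overall average), which is one reason such rewritings need care.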
The database theory community, centered around the PODS (Principles of Database Systems) conference, has had a long-term interest in logic as a way to represent "data," "information," and "knowledge" (take your pick on the term; it boils down to facts or atoms and rules, usually Horn clauses). The approach of this community has been "slow and steady," preferring to build up carefully from simple special cases to more general ideas, always paying attention to how efficiently we can process queries and perform other operations on the facts and rules. The term Datalog has been coined to refer to Prolog-like rules without function symbols, treated as a logic program. For example, a Datalog program for ancestors states that Y is an ancestor of X if Y is a parent of X or if there is some Z that is an ancestor of X and a descendant of Y. Because of the least-fixed-point semantics, there is no question of this program entering a loop, as the corresponding Prolog program would.
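The least-fixed-point reading of the ancestor rules can be sketched as a naive bottom-up evaluation. This is an illustrative sketch only: the predicate names `par` and `anc`, the rule syntax in the comments, and the sample facts are assumptions, not taken from the source.

```python
def ancestors(parent_facts):
    """Naive bottom-up evaluation of the (assumed) rules
         anc(X, Y) :- par(X, Y).
         anc(X, Y) :- anc(X, Z), anc(Z, Y).
    where a pair (x, y) in parent_facts reads "y is a parent of x".
    """
    anc = set(parent_facts)  # first rule: every parent is an ancestor
    while True:
        # second rule: join anc with itself on the shared variable Z
        derived = {(x, y) for (x, z) in anc for (z2, y) in anc if z == z2}
        new_facts = derived - anc
        if not new_facts:
            return anc  # nothing new can be derived: least fixed point reached
        anc |= new_facts

# Hypothetical facts: bob is a parent of carol, alice is a parent of bob.
family = ancestors({("carol", "bob"), ("bob", "alice")})

# Even cyclic "parent" data cannot cause non-termination here.
cyclic = ancestors({("a", "b"), ("b", "a")})
```

Termination follows because there are no function symbols: only finitely many pairs over the constants in the input can ever be derived, and each iteration either adds at least one new pair or stops.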
The purpose of data integration is to provide a uniform interface to a multitude of data sources. Data integration applications arise frequently as corporations attempt to provide their customers and employees with a consistent view of the data associated with their enterprise. Furthermore, the emergence of XML as a format for data transfer over the worldwide web is making data integration of autonomous, widely distributed sources an imminent reality. A data integration system frees its users from having to locate the sources relevant to their query, interact with each source in isolation, and manually combine the data from the different sources.