Information Retrieval (IR) is concerned with the identification of documents in a collection that are relevant to a given information need, usually represented as a query containing terms or keywords, which are supposed to be a good description of what the user is looking for. IR systems may improve their effectiveness (i.e., increase the number of relevant documents retrieved) by using a process of query expansion, which automatically adds new terms to the original query posed by a user. In this paper we develop a method of query expansion based on Bayesian networks. Using a learning algorithm, we construct a Bayesian network that represents some of the relationships among the terms appearing in a given document collection; this network is then used as a thesaurus (specific to that collection). We also report the results obtained by our method on three standard test collections.
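To make the idea of thesaurus-driven query expansion concrete, the following is a minimal sketch in Python. It is not the learning algorithm or the Bayesian network described above: the `ASSOCIATIONS` table, its numeric values, and the `threshold` and `max_new` parameters are all hypothetical stand-ins for the term relationships that a learned network might supply.

```python
from typing import Dict, List

# Hypothetical term-association strengths. In the setting described above,
# such values would come from a Bayesian network learned from the document
# collection; the numbers here are invented purely for illustration.
ASSOCIATIONS: Dict[str, Dict[str, float]] = {
    "car": {"automobile": 0.8, "engine": 0.4, "road": 0.1},
    "rental": {"hire": 0.7, "lease": 0.5},
}


def expand_query(terms: List[str], threshold: float = 0.45,
                 max_new: int = 3) -> List[str]:
    """Return the original terms followed by the strongest related terms."""
    scores: Dict[str, float] = {}
    for term in terms:
        for related, strength in ASSOCIATIONS.get(term, {}).items():
            if related in terms:
                continue  # never re-add a term the user already supplied
            # Keep the strongest association seen for each candidate term.
            scores[related] = max(scores.get(related, 0.0), strength)

    # Keep only sufficiently strong candidates, strongest first
    # (ties broken alphabetically so the result is deterministic).
    candidates = [t for t, s in scores.items() if s >= threshold]
    candidates.sort(key=lambda t: (-scores[t], t))
    return list(terms) + candidates[:max_new]


if __name__ == "__main__":
    print(expand_query(["car", "rental"]))
```

With the toy table above, the query `["car", "rental"]` gains the terms `automobile`, `hire`, and `lease`, while weaker associations such as `engine` fall below the threshold. A term with no entry in the table leaves the query unchanged.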
Answering queries posed over knowledge bases is a central problem in knowledge representation and database theory. In the database area, checking query containment is an important query optimization and schema integration technique. In knowledge representation, it has been used for object classification, schema integration, service discovery, and more. In the presence of a knowledge base, the problem of query containment is closely related to that of query answering; indeed, the two are reducible to each other. We focus on the latter, and our results immediately extend to the former.
Recently developed production systems enable users to specify an appropriate ordering or a clustering of join operations. Various efficiency heuristics have been used to optimize production rules manually. The problem addressed in this paper is how to automatically determine the best join structure for production system programs. Rather than applying the efficiency heuristics directly to programs, our algorithm enumerates possible join structures under various constraints. Evaluation results demonstrate that this algorithm generates a more efficient program than the one obtained by manual optimization.
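The general idea of enumerating join structures and picking the cheapest can be illustrated with a small Python sketch. This is not the algorithm or the constraints of the paper: it exhaustively enumerates left-deep join orders only, and the condition names, `CARDINALITY` and `SELECTIVITY` tables, and the cost model (the sum of estimated intermediate result sizes) are all assumptions made for illustration.

```python
from itertools import permutations
from typing import Dict, FrozenSet, Sequence, Tuple

# Hypothetical number of working-memory elements matching each condition.
CARDINALITY: Dict[str, int] = {"A": 100, "B": 10, "C": 1000}

# Hypothetical join selectivities between pairs of conditions. Pairs that
# are not listed share no variables and are treated as cross products.
SELECTIVITY: Dict[FrozenSet[str], float] = {
    frozenset({"A", "B"}): 0.01,
    frozenset({"B", "C"}): 0.001,
}


def plan_cost(order: Sequence[str]) -> float:
    """Estimate a left-deep plan's cost as the sum of intermediate sizes."""
    joined = [order[0]]
    size = float(CARDINALITY[order[0]])
    total = 0.0
    for cond in order[1:]:
        size *= CARDINALITY[cond]
        # Apply the selectivity of every join predicate that links the new
        # condition to a condition already in the plan.
        for prev in joined:
            size *= SELECTIVITY.get(frozenset({prev, cond}), 1.0)
        joined.append(cond)
        total += size
    return total


def best_join_order(conditions: Sequence[str]) -> Tuple[str, ...]:
    """Enumerate every left-deep order and return one of minimal cost."""
    return min(permutations(conditions), key=plan_cost)


if __name__ == "__main__":
    best = best_join_order(["A", "B", "C"])
    print(best, plan_cost(best))
```

Under these toy numbers, any order that joins `A` directly with `C` first pays for a large cross product, so the enumeration prefers orders that bring in `B` early. Exhaustive enumeration grows factorially with the number of conditions, which is one reason a practical approach would restrict the search with constraints.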
We consider a generalization of instance retrieval over knowledge bases that provides users with assertions in which descriptions of qualifying objects are given in addition to their identifiers. Notably, this involves a transfer of basic database paradigms involving caching and query rewriting in the context of an assertion retrieval algebra. We present an optimization framework for this algebra, with a focus on finding plans that avoid any need for general knowledge base reasoning at query execution time when sufficient cached results of earlier requests exist.
Conversational interfaces with computers have been the talk of tech since the days of Star Trek. They are mostly associated with voice response, and frustrating experiences interacting with Siri, chatbots, or the interactive voice response (IVR) systems of call centers reveal what a long slog it's been to get computers to understand natural language, regardless of whether it's in the form of voice or text. But it took the Amazon Echo's Alexa, which was designed as a conversational voice for Amazon's retail and entertainment services, to show that natural language interfaces could actually perform useful services. When we saw SAS founder Dr. James Goodnight demonstrate how Alexa could be used to query SAS Visual Analytics, we thought that was pretty cool. But when you look at this video, you'll realize that Alexa has only been taught a few things and has a long way to go before it will replace your keyboard or touchpad.