Many autonomous and heterogeneous information sources are becoming increasingly available to users through the Internet, especially through the World Wide Web. To make this information available in a consolidated, uniform, and efficient manner, it is necessary to integrate the different information sources. The integration of Internet sources poses several challenges that have not been sufficiently addressed by work on the integration of corporate databases residing on an Intranet [LMR90]. We believe that the most important ones are heterogeneity, the large number of sources, redundancy, availability, source autonomy, and diverse access methods and querying interfaces.
In this paper, we present an approach to mine large-scale knowledge graphs to discover inference paths for query expansion in NLIDB (Natural Language Interface to Databases). Addressing this problem is important in order for NLIDB applications to effectively handle relevant concepts in the domain of interest that do not correspond to any structured fields in the target database. We also present preliminary observations on the performance of our approach applied to Freebase, and conclude with discussions on next steps to further evaluate and extend our approach.
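The abstract does not spell out the mining procedure, but the core idea of discovering inference paths in a knowledge graph can be illustrated with a minimal sketch: a breadth-first enumeration of relation-label paths between two schema concepts. All names here (the toy graph, `find_inference_paths`, the relation labels) are illustrative assumptions, not the paper's actual method.

```python
from collections import deque

def find_inference_paths(graph, source, target, max_hops=3):
    """Enumerate relation-label paths from source to target of length
    at most max_hops, via breadth-first search over a schema graph."""
    paths = []
    queue = deque([(source, [])])  # (current concept, relations so far)
    while queue:
        node, path = queue.popleft()
        if len(path) >= max_hops:
            continue
        for relation, neighbor in graph.get(node, []):
            if neighbor == target:
                paths.append(path + [relation])
            else:
                queue.append((neighbor, path + [relation]))
    return paths

# Hypothetical toy schema graph: concept -> [(relation, neighbor), ...]
toy_graph = {
    "film": [("directed_by", "person"), ("starring", "person")],
    "person": [("nationality", "country")],
}

paths = find_inference_paths(toy_graph, "film", "country")
# Two 2-hop inference paths: one via directed_by, one via starring
```

Each discovered path (e.g. `directed_by` followed by `nationality`) is a candidate inference chain connecting a concept the user asked about to a field the database actually stores.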
Within a Question Answering (QA) framework, Question Context plays a vital role. We define Question Context to be background knowledge that can represent the user's information need more completely than the query terms alone. This paper proposes a novel approach that uses statistical language modeling techniques to develop a semantic Question Context, which we then incorporate into the Information Retrieval (IR) stage of QA. Our approach uses an Aspect-Based Relevance Language Model as the basis of the Question Context Model. Under this model, the sparse vocabulary of a query is supplemented with semantic information from concepts (or aspects) related to query terms that already exist within the corpus. We incorporate the Aspect-Based Relevance Language Model into the Question Context by first obtaining all of the latent concepts that exist in the corpus for a particular question topic. Then, we derive a likelihood of relevance that relates each Context Term (CT) associated with those aspects to the user's query. Context Terms from the topics with the highest likelihood of relevance are then incorporated into the query language model according to their relevance scores. We use both query expansion and document model smoothing techniques and evaluate our approach using the traditional recall metric. Our results are promising and show significant improvements in recall at low levels of precision using the query expansion method.
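As a rough illustration of the expansion step described above, the following sketch scores candidate context terms by combining each aspect's term distribution with a crude estimate of the aspect's relevance to the query. The aspect representation, the overlap-based relevance estimate, and all names are assumptions for illustration, not the paper's actual model.

```python
from collections import Counter

def expand_query(query_terms, aspects, top_k=3):
    """Score each candidate Context Term by summing, over aspects,
    P(term | aspect) weighted by a crude stand-in for P(aspect | query),
    then return the top_k expansion terms with their relevance scores."""
    scores = Counter()
    for aspect in aspects:
        # Stand-in for P(aspect | query): how much of the aspect's
        # term-distribution mass falls on the query terms.
        relevance = sum(aspect.get(t, 0.0) for t in query_terms)
        for term, p in aspect.items():
            if term not in query_terms:
                scores[term] += p * relevance
    return scores.most_common(top_k)

# Hypothetical aspects as term distributions, i.e. P(term | aspect)
aspects = [
    {"jaguar": 0.4, "car": 0.3, "engine": 0.3},   # automotive aspect
    {"jaguar": 0.5, "cat": 0.3, "jungle": 0.2},   # animal aspect
]
expansion = expand_query({"jaguar", "car"}, aspects)
# "engine" outranks "cat" and "jungle": the automotive aspect
# covers more of the query's probability mass
```

The scored terms would then be folded into the query language model (or a smoothed document model) in proportion to these relevance scores.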
We examine the query planning problem in information integration systems in the presence of sources that contain disjunctive information. We show that datalog, the language of choice for representing query plans in information integration systems, is not sufficiently expressive in this case. We prove that disjunctive datalog with inequality is sufficiently expressive, and present a construction of query plans that are guaranteed to extract all available information from disjunctive sources.

1 Introduction

We examine the query planning problem in information integration systems in the presence of sources that contain disjunctive information. The query planning problem in such systems can be formally stated as the problem of answering queries using views (Levy et al. 1995; Ullman 1997; Duschka & Genesereth 1997a): view definitions describe the information stored by sources, and query planning requires rewriting a query into one that uses only these views. In this paper we extend the algorithm for answering queries using conjunctive views introduced in (Duschka & Genesereth 1997a) to also handle disjunction in the view definitions.
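To see why disjunctive sources complicate query planning, consider the semantics directly: a tuple is a certain answer only if the query returns it in every possible world consistent with the source's disjunctions. The sketch below (all names hypothetical) enumerates the possible worlds of a toy source; note that "ann teaches something" is certain while "ann teaches db" is not, which is the kind of case-based reasoning plain datalog plans cannot express.

```python
from itertools import product

def certain_answers(disjunctive_facts, query):
    """A tuple is a certain answer iff the query returns it in every
    possible world. Each entry of disjunctive_facts lists alternative
    facts, exactly one of which holds; definite facts are singletons."""
    worlds = [set(choice) for choice in product(*disjunctive_facts)]
    answer_sets = [query(world) for world in worlds]
    return set.intersection(*answer_sets)

# Hypothetical source: ann teaches db OR ai; bob definitely teaches db.
source = [
    [("teaches", "ann", "db"), ("teaches", "ann", "ai")],  # disjunction
    [("teaches", "bob", "db")],                            # definite
]

teaches_something = lambda world: {p for (_, p, _) in world}
teaches_db = lambda world: {p for (_, p, c) in world if c == "db"}

certain_answers(source, teaches_something)  # {'ann', 'bob'} in both worlds
certain_answers(source, teaches_db)         # only {'bob'} is certain
```

Enumerating worlds explicitly is exponential and only meant to convey the semantics; the paper's contribution is a disjunctive-datalog plan construction that captures these certain answers without such enumeration.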
Conversational interfaces with computers have been the talk of tech since the days of Star Trek. Mostly associated with voice response, frustrating experiences interacting with Siri, chatbots, or the interactive voice response (IVR) systems of call centers reveal what a long slog it has been to get computers to understand natural language, whether in the form of voice or text. But it took the Amazon Echo's Alexa, designed as a conversational voice interface to Amazon's retail and entertainment services, to show that natural language interfaces could actually perform useful services. When we saw SAS founder Dr. James Goodnight demonstrate how Alexa could be used to query SAS Visual Analytics, we thought that was pretty cool. But when you look at this video, you'll realize that Alexa has only been taught a few things and has a long way to go before it will replace your keyboard or touchpad.