 Question Answering


Towards a Model of Question Generation for Promoting Creativity in Novice Writers

AAAI Conferences

Automated question generation has been explored for a broad range of tasks. However, an important task for which limited work on question generation has been undertaken is writing support. Writing support systems, particularly for novice writers who are acquiring the fundamentals of writing, can scaffold the complex processes that bear on writing. Novice writers face significant challenges in creative writing. Their stories often lack the expressive prose that characterizes texts produced by their expert writer counterparts. A story that is composed by a novice writer may also lack a compelling plot, may not effectively utilize a story's setting, characters, and props, and may describe events that play out in an unpredictable or confusing order. We propose an automatic question generation framework that is designed to stimulate the cognitive processes associated with creative writing. The framework utilizes semantic role labeling and discourse parsing applied to the initial drafts of the writer's passage to generate questions to promote creativity.
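To illustrate the general idea only, the sketch below turns a semantic-role-labeled sentence into prompt questions about roles the draft leaves unfilled. The role labels follow PropBank conventions, but the templates, function names, and the assumption that SRL output is already available are all illustrative, not the paper's actual framework.

```python
from typing import Dict, List

# Hypothetical question templates keyed by the semantic role that is
# missing from the writer's sentence; each missing role becomes a
# creativity prompt.
TEMPLATES = {
    "ARG0": "Who or what {pred}s {arg1}?",
    "ARGM-LOC": "Where does {arg0} {pred} {arg1}?",
    "ARGM-CAU": "Why does {arg0} {pred} {arg1}?",
    "ARGM-TMP": "When does {arg0} {pred} {arg1}?",
}

def generate_questions(pred: str, roles: Dict[str, str]) -> List[str]:
    """Generate one prompt question for each role absent from the frame."""
    questions = []
    for role, template in TEMPLATES.items():
        if role not in roles:
            questions.append(template.format(
                pred=pred,
                arg0=roles.get("ARG0", "someone"),
                arg1=roles.get("ARG1", "something"),
            ))
    return questions

# Example: "The knight guarded the bridge." has an agent and a patient,
# but no location, cause, or time -- three openings for elaboration.
qs = generate_questions("guard", {"ARG0": "the knight", "ARG1": "the bridge"})
```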


Using Automatic Question Generation to Evaluate Questions Generated by Children

AAAI Conferences

This paper shows that automatically generated questions can help classify children's spoken responses to a reading tutor teaching them to generate their own questions. We use automatic question generation to model and classify children's prompted spoken questions about stories. On distinguishing complete and incomplete questions from irrelevant speech and silence, a language model built from automatically generated questions outperforms a trigram language model that does not exploit the structure of questions.
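A toy sketch of the underlying comparison: score an utterance under a trigram language model trained on stand-ins for automatically generated questions. The training data, add-one smoothing, and scoring are illustrative assumptions, not the paper's actual models.

```python
import math
from collections import defaultdict

def train(sentences):
    """Collect trigram and bigram counts plus the vocabulary size."""
    counts, bigram_counts, vocab = defaultdict(int), defaultdict(int), set()
    for s in sentences:
        padded = ["<s>", "<s>"] + s + ["</s>"]
        vocab.update(padded)
        for i in range(2, len(padded)):
            counts[tuple(padded[i-2:i+1])] += 1
            bigram_counts[tuple(padded[i-2:i])] += 1
    return counts, bigram_counts, len(vocab)

def trigram_logprob(tokens, counts, bigram_counts, vocab_size):
    """Add-one-smoothed trigram log-probability of a token sequence."""
    padded = ["<s>", "<s>"] + tokens + ["</s>"]
    logp = 0.0
    for i in range(2, len(padded)):
        tri, bi = tuple(padded[i-2:i+1]), tuple(padded[i-2:i])
        logp += math.log((counts[tri] + 1) / (bigram_counts[bi] + vocab_size))
    return logp

# Toy stand-ins for automatically generated questions.
question_lm = train([["why", "did", "the", "fox", "run"],
                     ["why", "did", "the", "dog", "bark"]])

# A well-formed question scores higher than a scrambled utterance of
# the same length -- the kind of signal used for classification.
good = trigram_logprob(["why", "did", "the", "fox", "run"], *question_lm)
bad = trigram_logprob(["fox", "the", "run", "did", "why"], *question_lm)
```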


Evaluating HILDA in the CODA Project: A Case Study in Question Generation Using Automatic Discourse Analysis

AAAI Conferences

Recent studies on question generation identify the need for automatic discourse analysers. We evaluate the feasibility of integrating an available discourse analyser called HILDA into a specific question generation system called CODA; introduce an approach for extracting a discourse corpus from the CODA parallel corpus; and identify future work towards automatic discourse analysis in the domain of question generation.


Conjunctive Query Answering for the Description Logic SHIQ

arXiv.org Artificial Intelligence

Conjunctive queries play an important role as an expressive query language for Description Logics (DLs). Although modern DLs usually provide for transitive roles, conjunctive query answering over DL knowledge bases is only poorly understood if transitive roles are admitted in the query. In this paper, we consider unions of conjunctive queries over knowledge bases formulated in the prominent DL SHIQ and allow transitive roles in both the query and the knowledge base. We show decidability of query answering in this setting and establish two tight complexity bounds: regarding combined complexity, we prove that there is a deterministic algorithm for query answering that runs in time single exponential in the size of the KB and double exponential in the size of the query, which is optimal. Regarding data complexity, we prove containment in co-NP.


Consistent Query Answering via ASP from Different Perspectives: Theory and Practice

arXiv.org Artificial Intelligence

A data integration system provides transparent access to different data sources by suitably combining their data and providing the user with a unified view of them, called the global schema. However, source data are generally not under the control of the data integration process, so integrated data may violate global integrity constraints even in the presence of locally consistent data sources. In this scenario, it is nevertheless interesting to retrieve as much consistent information as possible. The process of answering user queries under global constraint violations is called consistent query answering (CQA). Several notions of CQA have been proposed, e.g., depending on whether integrated information is assumed to be sound, complete, exact, or a variant of these. This paper provides a contribution in this setting: it unifies solutions coming from different perspectives under a common ASP-based core, and provides query-driven optimizations designed to isolate and eliminate inefficiencies of the general approach for computing consistent answers. Moreover, the paper introduces some new theoretical results enriching existing knowledge on the decidability and complexity of the considered problems. The effectiveness of the approach is evidenced by experimental results. To appear in Theory and Practice of Logic Programming (TPLP).
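The repair semantics behind CQA can be conveyed with a brute-force sketch: a repair keeps one tuple per violated key group, and a consistent answer is one that holds in every repair. The paper compiles this reasoning into optimized ASP programs; the relation, key constraint, and function names below are toy assumptions used only to show the semantics.

```python
from itertools import product

def repairs(tuples, key_index=0):
    """All repairs: pick exactly one tuple from each key-equal group."""
    groups = {}
    for t in tuples:
        groups.setdefault(t[key_index], []).append(t)
    return [set(choice) for choice in product(*groups.values())]

def consistent_answer(tuples, query):
    """A query is consistently true iff it holds in every repair."""
    return all(query(r) for r in repairs(tuples))

# employee(id, dept): id is a key, but the integrated sources disagree
# on employee 1's department.
emp = [(1, "sales"), (1, "hr"), (2, "hr")]

# "Is someone in hr?" holds in every repair -> consistently true.
hr_always = consistent_answer(emp, lambda r: any(d == "hr" for _, d in r))
# "Is employee 1 in sales?" fails in the repair keeping (1, "hr").
sales_always = consistent_answer(emp, lambda r: (1, "sales") in r)
```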


The Complexity of Causality and Responsibility for Query Answers and non-Answers

arXiv.org Artificial Intelligence

An answer to a query has a well-defined lineage expression (alternatively called how-provenance) that explains how the answer was derived. Recent work has also shown how to compute the lineage of a non-answer to a query. However, the cause of an answer or non-answer is a more subtle notion and consists, in general, of only a fragment of the lineage. In this paper, we adapt Halpern, Pearl, and Chockler's recent definitions of causality and responsibility to define the causes of answers and non-answers to queries, and their degree of responsibility. Responsibility captures the notion of degree of causality and serves to rank potentially many causes by their relative contributions to the effect. Then, we study the complexity of computing causes and responsibilities for conjunctive queries. It is known that computing causes is NP-complete in general. Our first main result shows that all causes of answers to conjunctive queries can be computed by a relational query, which may involve negation. Thus, causality can be computed in PTIME, and very efficiently so. Next, we study computing responsibility. Here, we prove that the complexity depends on the conjunctive query and demonstrate a dichotomy between PTIME and NP-complete cases. For the PTIME cases, we give a non-trivial algorithm, consisting of a reduction to the max-flow computation problem. Finally, we prove that, even when it is in PTIME, responsibility is complete for LOGSPACE, implying that, unlike causality, it cannot be computed by a relational query.
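The actual-cause definition the paper adapts can be sketched by brute force: a tuple t is a counterfactual cause if removing it makes the answer disappear, and an actual cause if this happens after first removing some contingency set. This enumeration is exponential and exists only to illustrate the definition; the paper's point is that causes are in fact computable by a relational query, hence in PTIME. The example query and names are assumptions of this sketch.

```python
from itertools import combinations

def is_actual_cause(t, db, query):
    """t is an actual cause of query(db) if, for some contingency set
    gamma drawn from db without t, the query still holds once gamma is
    removed but fails when t is removed as well."""
    rest = [u for u in db if u != t]
    for k in range(len(rest) + 1):
        for gamma in combinations(rest, k):
            remaining = set(rest) - set(gamma)
            if query(remaining | {t}) and not query(remaining):
                return True
    return False

# Boolean query: "is there a path from a to c?" over an edge relation.
edges = {("a", "b"), ("b", "c"), ("a", "c")}

def reaches_a_c(db):
    return ("a", "c") in db or (("a", "b") in db and ("b", "c") in db)

# ("a", "c") is not a counterfactual cause on its own (the a->b->c path
# survives its removal), but it is an actual cause with contingency
# set {("a", "b")}.
cause = is_actual_cause(("a", "c"), edges, reaches_a_c)
```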


Analyzing and Predicting Not-Answered Questions in Community-based Question Answering Services

AAAI Conferences

This paper focuses on analyzing and predicting not-answered questions in Community-based Question Answering (CQA) services, such as Yahoo! Answers. In CQA services, users express their information needs by submitting natural language questions and await answers from other human users. Compared to receiving results from web search engines using keyword queries, CQA users are likely to get more specific answers, because human answerers may catch the main point of the question. However, one of the key problems of this pattern is that sometimes no one helps to give answers, whereas web search engines rarely fail to respond. In this paper, we analyze not-answered questions and make a first attempt at predicting whether questions will receive answers. More specifically, we first analyze questions from Yahoo! Answers based on features selected from different perspectives. Then, we formalize the prediction problem as a supervised learning (binary classification) problem and leverage the proposed features to make predictions. Extensive experiments are conducted on 76,251 questions collected from Yahoo! Answers. We analyze the specific characteristics of not-answered questions and suggest possible reasons why a question is not likely to be answered. As for prediction, the experimental results show that classification based on the proposed features significantly outperforms the simple word-based approach.
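As a schematic of the prediction setup only: toy features and a plain perceptron stand in here for the paper's feature set and learner, neither of which is reproduced. Feature names, weights, and data are all invented for illustration.

```python
def features(question, category_popularity):
    """Toy feature vector; the entries are stand-ins for the paper's features."""
    return [
        len(question.split()) / 20.0,     # normalized question length
        1.0 if "?" in question else 0.0,  # well-formedness proxy
        category_popularity,              # activity level of the category
        1.0,                              # bias term
    ]

def train_perceptron(data, epochs=20, lr=0.1):
    """Train a plain perceptron; label 1 means the question got answered."""
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        for x, y in data:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
            for i in range(len(w)):
                w[i] += lr * (y - pred) * x[i]
    return w

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

# Toy labeled data: (features, answered?).
data = [
    (features("How do I fix a flat bicycle tire?", 0.9), 1),
    (features("Best pizza?", 0.8), 1),
    (features("thoughts", 0.1), 0),
    (features("asdf qwerty zxcv help plz no idea what im asking", 0.1), 0),
]
w = train_perceptron(data)
```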


Leveraging Wikipedia Characteristics for Search and Candidate Generation in Question Answering

AAAI Conferences

Most existing Question Answering (QA) systems adopt a type-and-generate approach to candidate generation that relies on a pre-defined domain ontology. This paper describes a type independent search and candidate generation paradigm for QA that leverages Wikipedia characteristics. This approach is particularly useful for adapting QA systems to domains where reliable answer type identification and type-based answer extraction are not available. We present a three-pronged search approach motivated by relations an answer-justifying title-oriented document may have with the question/answer pair. We further show how Wikipedia metadata such as anchor texts and redirects can be utilized to effectively extract candidate answers from search results without a type ontology. Our experimental results show that our strategies obtained high binary recall in both search and candidate generation on TREC questions, a domain that has mature answer type extraction technology, as well as on Jeopardy! questions, a domain without such technology. Our high-recall search and candidate generation approach has also led to high overall QA performance in Watson, our end-to-end system.
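The metadata-driven candidate extraction can be sketched as follows. The anchor-text and redirect tables are toy stand-ins for ones mined from a Wikipedia dump, and nothing here reflects Watson's actual implementation; it only shows how such metadata yields candidates without a type ontology.

```python
# anchor text -> set of article titles it links to (toy data)
ANCHORS = {
    "big blue": {"IBM"},
    "the bard": {"William Shakespeare"},
}
# redirect title -> canonical article title (toy data)
REDIRECTS = {"Shakespear": "William Shakespeare"}

def candidates_from_passage(passage):
    """Collect candidate answers by matching anchor texts and redirects
    in a retrieved passage, without consulting any type ontology."""
    text = passage.lower()
    found = set()
    for anchor, titles in ANCHORS.items():
        if anchor in text:
            found |= titles
    for redirect, title in REDIRECTS.items():
        if redirect.lower() in text:
            found.add(title)
    return found

cands = candidates_from_passage("Nicknamed Big Blue, this company built Watson.")
```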


Introduction to the Special Issue on Question Answering

AI Magazine

This special issue of AI Magazine presents six articles on some of the most interesting question answering systems in development today. Included are articles on Project, the Semantic Research, Watson, True Knowledge, and TextRunner (the University of Washington's clever use of statistical NL techniques to answer questions across the open web).


True Knowledge: Open-Domain Question Answering Using Structured Knowledge and Inference

AI Magazine

This article gives a detailed description of True Knowledge: a commercial, open-domain question answering platform. The system combines a large and growing structured knowledge base of common-sense, factual, and lexical knowledge; a natural language translation system that turns user questions into internal, language-independent queries; and an inference system that can answer those queries using both directly represented and inferred knowledge. The system is live and answers millions of questions per month asked by internet users.