 Question Answering


Answer Extraction from Passage Graph for Question Answering

AAAI Conferences

In question answering, answer extraction aims to pinpoint the exact answer from passages. However, most previous methods perform such extraction on each passage separately, without considering clues provided in other passages. This paper presents a novel approach to extract answers by fully leveraging connections among different passages. Specifically, extraction is performed on a Passage Graph, which is built by adding links across multiple passages. Different passages are connected by linking words with the same stem. We use the factor graph as our model for answer extraction. Experimental results on multiple QA data sets demonstrate that our method significantly improves the performance of answer extraction.
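
The linking step described above (connecting words that share a stem across passages) can be illustrated with a minimal sketch. This is not the authors' implementation: `crude_stem` is a toy suffix stripper standing in for a real stemmer, `build_passage_graph` is a hypothetical helper name, and the factor-graph inference over the resulting graph is not shown.

```python
from collections import defaultdict
from itertools import combinations


def crude_stem(word):
    """Toy suffix stripper; an assumption standing in for a real stemmer."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_passage_graph(passages):
    """Link word occurrences in different passages that share a stem.

    Nodes are (passage_index, token_index) pairs; the result is a set of
    undirected edges, each stored as a pair of nodes.
    """
    by_stem = defaultdict(list)
    for p_idx, passage in enumerate(passages):
        for t_idx, token in enumerate(passage.split()):
            cleaned = token.strip(".,;:!?")
            if cleaned:  # skip tokens that were pure punctuation
                by_stem[crude_stem(cleaned)].append((p_idx, t_idx))

    edges = set()
    for nodes in by_stem.values():
        for a, b in combinations(nodes, 2):
            if a[0] != b[0]:  # only link occurrences from different passages
                edges.add((a, b))
    return edges


passages = [
    "Einstein developed relativity",
    "The theory was developing slowly",
]
print(build_passage_graph(passages))
```

Here "developed" and "developing" reduce to the same toy stem, so the two passages are joined by a single edge between those word positions.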


The Impact of Disjunction on Query Answering Under Guarded-Based Existential Rules

AAAI Conferences

We study the complexity of conjunctive query answering under (weakly-)(frontier-)guarded disjunctive existential rules, i.e., existential rules extended with disjunction, and their main subclasses, linear rules and inclusion dependencies (IDs). Our main result states that conjunctive query answering under a fixed set of disjunctive IDs is 2EXPTIME-hard. This quite surprising result together with a 2EXPTIME upper bound for weakly-frontier-guarded disjunctive rules, obtained by exploiting recent results on guarded negation first-order logic, gives us a complete picture of the computational complexity of our problem. We also consider a natural subclass of disjunctive IDs, namely frontier-one (only one variable is propagated), for which the combined complexity decreases to EXPTIME. Finally, we show that frontier-guarded rules, combined with negative constraints, are strictly more expressive than DL-Lite^H_bool, one of the most expressive languages of the DL-Lite family. We also show that query answering under this DL is 2EXPTIME-complete in combined complexity.


USI Answers: Natural Language Question Answering Over (Semi-) Structured Industry Data

AAAI Conferences

The paper reports on progress towards the goal of offering easy access to enterprise data to a large number of business users, most of whom are not familiar with the specific syntax or semantics of the underlying data sources. Additional complications come from the nature of the data, which is both structured and unstructured. The proposed solution allows users to express questions in natural language, makes apparent the system's interpretation of the query, and allows easy query adjustment and reformulation. The application is in use by more than 1500 users from Siemens Energy. We evaluate our approach on a data set consisting of fleet data.


Introducing Nominals to the Combined Query Answering Approaches for EL

AAAI Conferences

So-called combined approaches answer a conjunctive query over a description logic ontology in three steps: first, they materialise certain consequences of the ontology and the data; second, they evaluate the query over the data; and third, they filter the result of the second phase to eliminate unsound answers. Such approaches were developed for various members of the DL-Lite and the EL families of languages, but none of them can handle ontologies containing nominals. In our work, we bridge this gap and present a combined query answering approach for ELHO--a logic that contains all features of the OWL 2 EL standard apart from transitive roles and complex role inclusions. This extension is nontrivial because nominals require equality reasoning, which introduces complexity into the first and the third step. Our empirical evaluation suggests that our technique is suitable for practical application, and so it provides a practical basis for conjunctive query answering in a large fragment of OWL 2 EL.


Learning to Rank Effective Paraphrases from Query Logs for Community Question Answering

AAAI Conferences

We present a novel method for ranking query paraphrases for effective search in community question answering (cQA). The method uses query logs from Yahoo! Search and Yahoo! Answers to automatically extract a corpus of paraphrases of queries and questions using the query-question click history. Elements of this corpus are automatically ranked according to recall and mean reciprocal rank (MRR), and then used to train two independent learning-to-rank models (SVMRank), whereby a set of new query paraphrases can be scored according to recall and MRR. We perform several automatic evaluation procedures using cross-validation to analyze the behavior of various aspects of our learned ranking functions, which show that our method is useful and effective for search in cQA.
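
For readers unfamiliar with the two ranking criteria named above, the following is a minimal sketch of how recall and mean reciprocal rank are conventionally computed. The function names and toy data are illustrative assumptions; the abstract does not specify how the authors compute these measures in detail.

```python
def reciprocal_rank(ranked, relevant):
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average the reciprocal rank over (ranked_list, relevant_set) pairs."""
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def recall(ranked, relevant):
    """Fraction of the relevant items that appear anywhere in the ranked list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & set(relevant)) / len(relevant)


# Toy data (hypothetical): three queries with their retrieved lists.
runs = [
    (["a", "b", "c"], {"b"}),  # first relevant item at rank 2
    (["x", "y"], {"x"}),       # first relevant item at rank 1
    (["p", "q"], {"z"}),       # nothing relevant retrieved
]
print(mean_reciprocal_rank(runs))
print(recall(["a", "b", "c"], {"b", "d"}))
```

A paraphrase that surfaces the first relevant question higher in the list gets a larger reciprocal rank, while recall rewards paraphrases that retrieve more of the relevant questions regardless of position.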


Booming Up the Long Tails: Discovering Potentially Contributive Users in Community-Based Question Answering Services

AAAI Conferences

Community-based question answering (CQA) services such as Yahoo! Answers have been widely used by Internet users to get answers to their inquiries. CQA services rely entirely on the contributions of their users. However, it is known that newcomers are prone to lose interest and leave the communities. Thus, finding expert users in an early phase, when they are still active, is essential to improve the chances of motivating them to contribute to the communities further. In this paper, we propose a novel approach to discovering "potentially" contributive users from recently-joined users in CQA services. The likelihood of becoming a contributive user is defined by the user's expertise as well as availability, which we call the answer affordance. The main technical difficulty lies in the fact that such recently-joined users do not have abundant information accumulated over many years. We utilize a user's productive vocabulary to mitigate the lack of available information, since the vocabulary is the most fundamental element that reveals his/her knowledge. Extensive experiments were conducted with a huge data set of Naver Knowledge-In (KiN), which is the dominant CQA service in Korea. We demonstrate that the top rankers selected by the answer affordance outperformed those selected by KiN in terms of the amount of answering activity.


Towards Predicting the Best Answers in Community-based Question-Answering Services

AAAI Conferences

Community-based question-answering (CQA) services contribute to solving many difficult questions we have. For each question in such services, one best answer can be designated, among all answers, often by the asker. However, many questions on typical CQA sites are left without a best answer even when good candidates are available. In this paper, we attempt to address the problem of predicting whether an answer may be selected as the best answer, based on learning from labeled data. The key tasks include designing features measuring important aspects of an answer and identifying the most important features. Experiments with a Stack Overflow dataset show that the contextual information among the answers should be the most important factor to consider.


Human Judgment on Humor Expressions in a Community-Based Question-Answering Service

AAAI Conferences

For understanding humorous dialogue, a collection of humorous expressions is needed. In addition to the humorous expressions themselves, their annotations are important if they are to be used as language resources. In this paper, we analyzed how human assessors annotate humorous expressions extracted from an online community-based question-answering (CQA) corpus, which contains many interesting examples of humorous communication. We analyzed the annotation results of a collection of humorous expressions as done by 28 annotators in terms of the degree of humor and categorization of humor. We found the assessments to be quite subjective, and only marginal inter-annotator agreement was observed. This result suggests that the variability in humor annotations is not noise resulting from erroneous assessment but is rooted in personality differences of the annotators. It would be necessary to incorporate the individual differences in humor perception for properly utilizing the resources. We discuss the possibility of improving the collection process by applying filtering techniques.


BioASQ: A Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering

AAAI Conferences

This article provides an overview of BioASQ, a new competition on biomedical semantic indexing and question answering (QA). BioASQ aims to push towards systems that will allow biomedical workers to express their information needs in natural language and that will return concise and user-understandable answers by combining information from multiple sources of different kinds, including biomedical articles, databases, and ontologies. BioASQ encourages participants to adopt semantic indexing as a means to combine multiple information sources and to facilitate the matching of questions to answers. It also adopts a broad semantic indexing and QA architecture that subsumes current relevant approaches, even though no current system instantiates all of its components. Hence, the architecture can also be seen as our view of how relevant work from fields such as information retrieval, hierarchical classification, question answering, ontologies, and linked data can be combined, extended, and applied to biomedical question answering. BioASQ will develop publicly available benchmarks and it will adopt and possibly refine existing evaluation measures. The evaluation infrastructure of the competition will remain publicly available beyond the end of BioASQ.


A Data-Driven Approach to Question Subjectivity Identification in Community Question Answering

AAAI Conferences

Automatic Subjective Question Answering (ASQA), which aims at answering users' subjective questions using summaries of multiple opinions, is becoming increasingly important. One challenge of ASQA is that expected answers for subjective questions may not readily exist on the Web. The rise and popularity of Community Question Answering (CQA) sites, which provide platforms for people to post and answer questions, provide an alternative to ASQA. One important task of ASQA is question subjectivity identification, which identifies whether a user is asking a subjective question. Unfortunately, there has been little labeled training data available for this task. In this paper, we propose an approach to collect training data automatically by utilizing social signals in CQA sites without involving any manual labeling. Experimental results show that our data-driven approach achieves a 9.37% relative improvement over the supervised approach using manually labeled data, and a 5.15% relative gain over a state-of-the-art semi-supervised approach. In addition, we propose several heuristic features for question subjectivity identification. By adding these features, we achieve an 11.23% relative improvement over the word n-gram feature under the same experimental setting.