Yahoo! Research Latin America
Learning to Rank Effective Paraphrases from Query Logs for Community Question Answering
Figueroa, Alejandro (Yahoo! Research Latin America) | Neumann, Guenter (DFKI)
We present a novel method for ranking query paraphrases for effective search in community question answering (cQA). The method uses query logs from Yahoo! Search and Yahoo! Answers to automatically extract a corpus of query and question paraphrases from the query-question click history. Elements of this corpus are automatically ranked according to recall and mean reciprocal rank (MRR), and then used to train two independent learning-to-rank models (SVMRank), with which a set of new query paraphrases can be scored according to recall and MRR. We perform several automatic evaluation procedures using cross-validation to analyze the behavior of various aspects of our learned ranking functions; the results show that our method is useful and effective for search in cQA.
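The two ranking criteria named in the abstract, recall and MRR, can be illustrated with a minimal sketch. It assumes that each paraphrase retrieves a ranked list of answer identifiers and that a set of relevant answers is known; the function names and the toy identifiers are hypothetical and not taken from the paper.

```python
from typing import Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/position of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over several (ranked list, relevant set) pairs."""
    scores = [reciprocal_rank(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


def recall(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of the relevant items that appear anywhere in the ranked list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)


# Hypothetical toy data: answers retrieved by one paraphrase, and the relevant set.
retrieved = ["a1", "a2", "a3"]
relevant_answers = {"a2", "a4"}

rr = reciprocal_rank(retrieved, relevant_answers)   # first hit is at position 2
rec = recall(retrieved, relevant_answers)           # one of two relevant answers found
mrr = mean_reciprocal_rank([(retrieved, relevant_answers), (["b1"], {"b1"})])
```

Under these assumptions, paraphrases of the same query could be ordered by either score to produce the two rankings from which the two models are trained.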
Maximum Entropy Context Models for Ranking Biographical Answers to Open-Domain Definition Questions
Figueroa, Alejandro (Yahoo! Research Latin America) | Atkinson, John (Universidad de Concepción)
In the context of question-answering systems, there are several strategies for scoring candidate answers to definition queries, including centroid vectors, bi-term language models, and context language models. These techniques use only positive examples (i.e., descriptions) when building their models. In this work, a maximum-entropy-based extension is proposed for context language models so as to account for regularities across non-descriptions mined from web snippets. Experiments show that this extension outperforms the other strategies, increasing the precision of the top five ranked answers by more than 5%. Results suggest that web snippets are a cost-efficient source of non-descriptions, and that some relationships extracted from dependency trees are effective for mining candidate answer sentences.
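The reported gain concerns the precision of the top five ranked answers. A minimal sketch of one common way to compute precision at k follows; it assumes a ranked list of candidate sentence identifiers and a set of identifiers judged to be genuine descriptions. The abstract does not specify the exact evaluation procedure, so the function name and the toy data are hypothetical.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k ranked items that are relevant (divides by k)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


# Hypothetical toy data: six ranked candidate sentences, three judged as descriptions.
ranked_candidates = ["s1", "s2", "s3", "s4", "s5", "s6"]
descriptions = {"s1", "s3", "s6"}

p_at_5 = precision_at_k(ranked_candidates, descriptions, k=5)  # s1 and s3 are in the top five
```

Dividing by k rather than by the number of items actually returned penalizes lists shorter than k; this is a design choice of the sketch, not something stated in the abstract.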