Shen, Yikang (Beihang University) | Rong, Wenge (Beihang University) | Jiang, Nan (Beihang University) | Peng, Baolin (The Chinese University of Hong Kong) | Tang, Jie (Tsinghua University) | Xiong, Zhang (Beihang University)
The large-scale Q&A archives accumulated in community-based question answering (CQA) services are an important information and knowledge resource on the web. Question and answer matching has received considerable attention for its ability to reuse the knowledge stored in these systems: it can improve the user experience for recurrent questions. In this paper, a Word Embedding based Correlation (WEC) model is proposed that integrates the advantages of both the translation model and word embedding. Given a random pair of words, WEC can score their co-occurrence probability in Q&A pairs, and it can also leverage the continuity and smoothness of continuous space word representations to deal with new pairs of words that are rare in the training parallel text. An experimental study on the Yahoo! Answers and Baidu Zhidao datasets shows this new method's promising potential.
With the development of community-based question answering (Q&A) services, large-scale Q&A archives have accumulated and become an important information and knowledge resource on the web. Question and answer matching has received considerable attention for its ability to reuse the knowledge stored in these systems: it can improve the user experience for recurrent questions. In this paper, we try to improve matching accuracy by overcoming the lexical gap between questions and answers. A Word Embedding based Correlation (WEC) model is proposed that integrates the advantages of both the translation model and word embedding. Given a random pair of words, WEC can score their co-occurrence probability in Q&A pairs, and it can also leverage the continuity and smoothness of continuous space word representations to deal with new pairs of words that are rare in the training parallel text. An experimental study on the Yahoo! Answers and Baidu Zhidao datasets shows this new method's promising potential.
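The idea of scoring a word pair in embedding space can be illustrated with a minimal sketch. The abstract does not give the scoring function, so everything here is an assumption made for illustration: the matrix `M` that maps answer-word vectors into the question-word space, the cosine score, the max-then-average aggregation in `qa_relevance`, and the random vectors standing in for trained embeddings.

```python
import numpy as np

def word_correlation(q_vec, a_vec, M):
    """Cosine between a question-word vector and a linearly mapped answer-word vector.

    M is a hypothetical learned matrix; it is not taken from the abstract.
    """
    mapped = M @ a_vec
    denom = np.linalg.norm(q_vec) * np.linalg.norm(mapped)
    return float(q_vec @ mapped / denom) if denom > 0 else 0.0

def qa_relevance(q_vecs, a_vecs, M):
    """Assumed aggregation: best-matching question word per answer word, then the mean."""
    scores = [max(word_correlation(q, a, M) for q in q_vecs) for a in a_vecs]
    return sum(scores) / len(scores)

# Toy data: random vectors stand in for trained word embeddings.
rng = np.random.default_rng(0)
dim = 4
M = np.eye(dim)                        # identity map, for illustration only
question = rng.normal(size=(3, dim))   # 3 question words
answer = rng.normal(size=(5, dim))     # 5 answer words
score = qa_relevance(question, answer, M)
print(f"relevance score: {score:.3f}")
```

Because the score is computed from continuous vectors rather than from a lookup table of observed word pairs, a pair that never co-occurred in training still receives a score determined by where its words lie in the embedding space.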
Fang, Hanyin (Zhejiang University) | Wu, Fei (Zhejiang University) | Zhao, Zhou (Zhejiang University) | Duan, Xinyu (Zhejiang University) | Zhuang, Yueting (Zhejiang University) | Ester, Martin (Simon Fraser University)
Community-based question answering (cQA) sites have accumulated a vast amount of questions and corresponding crowdsourced answers over time. How to efficiently share the underlying information and knowledge from reliable (usually highly reputable) answerers has become an increasingly popular research topic. A major challenge in cQA tasks is accurately matching high-quality answers to given questions. Many traditional approaches recommend answers based merely on the content similarity between questions and answers, and therefore suffer from the sparsity bottleneck of cQA data. In this paper, we propose a novel framework that encodes not only the content of question-answer (Q-A) pairs but also the social interaction cues in the community to boost cQA tasks. More specifically, our framework collaboratively utilizes the rich interaction among questions, answers, and answerers to learn the relative quality rank of different answers to the same question. Moreover, the information in heterogeneous social networks is comprehensively employed by our deep random walk learning framework to enhance the quality of question-answer (QA) matching. Extensive experiments on a large-scale dataset from a real-world cQA site show that leveraging the heterogeneous social information indeed achieves better performance than other state-of-the-art cQA methods.
Zhao, Zhou (Zhejiang University) | Lu, Hanqing (Zhejiang University) | Zheng, Vincent W. (Advanced Digital Sciences Center) | Cai, Deng (Zhejiang University) | He, Xiaofei (Zhejiang University) | Zhuang, Yueting (Zhejiang University)
Community-based question answering (CQA) sites have become popular web services and have accumulated millions of questions and their posted answers over time. Thus, question answering, which ranks the high-quality answers to a given question, has become an essential problem in CQA sites. Most existing work studies the problem with deep semantic matching models that rank answers by their semantic relevance, while ignoring the authority of answerers with respect to the given question. In this paper, we consider the problem of community-based question answering from the viewpoint of asymmetric multi-faceted ranking network embedding. We propose a novel asymmetric multi-faceted ranking network learning framework for community-based question answering that jointly exploits the deep semantic relevance between question-answer pairs and the answerers' authority with respect to the given question. We then develop an asymmetric ranking network learning method with deep recurrent neural networks that integrates both the answers' relative quality rank for the given question and the answerers' following relations in CQA sites. Extensive experiments on a large-scale dataset from a real-world CQA site show that our method achieves better performance than other state-of-the-art solutions to the problem.
In this paper, we propose solutions to advance answer selection in Community Question Answering (CQA). Unlike previous work, we propose a hybrid attention mechanism to model question-answer pairs. Specifically, for each word, we calculate the intra-sentence attention, which indicates its local importance, and the inter-sentence attention, which indicates its importance to the counterpart sentence. The inter-sentence attention is based on the interactions between question-answer pairs, and the combination of these two attention mechanisms enables us to align the most informative parts of question-answer pairs for sentence matching. Additionally, we exploit user information for answer selection, because users are more likely to provide correct answers in their areas of expertise. We model users from their written answers to alleviate the data sparsity problem, and then learn user representations from the informative parts of sentences that are useful for the question-answer matching task. This means of modelling users can bridge the semantic gap between different users, as similar users may word their answers in the same way. The representations of users, questions, and answers are learnt in an end-to-end neural network in a way that best explains the interrelation between question-answer pairs. We validate the proposed model on a public dataset and demonstrate its advantages over the baselines with thorough experiments.
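A minimal sketch of how intra-sentence and inter-sentence attention might be combined is given below. The abstract does not specify the scoring functions, so the dot-product scores, the max over counterpart words, the mixing weight `mix`, the shared vector `w_intra`, and the random word vectors are all assumptions made for illustration, not the model described above.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def hybrid_attention(sent, other, w_intra, mix=0.5):
    """Return per-word weights and the weighted sentence vector.

    sent:    (n, d) word vectors of the sentence being summarised
    other:   (m, d) word vectors of the counterpart sentence
    w_intra: (d,)   hypothetical learned vector scoring local importance
    mix:     assumed interpolation weight between the two attentions
    """
    # Intra-sentence attention: importance of each word on its own.
    intra = softmax(sent @ w_intra)
    # Inter-sentence attention: strongest interaction with any counterpart word.
    inter = softmax((sent @ other.T).max(axis=1))
    weights = mix * intra + (1.0 - mix) * inter
    return weights, weights @ sent

# Toy data: random vectors stand in for learned word representations.
rng = np.random.default_rng(0)
dim = 8
question = rng.normal(size=(4, dim))   # 4 question words
answer = rng.normal(size=(6, dim))     # 6 answer words
w_intra = rng.normal(size=dim)

q_weights, q_repr = hybrid_attention(question, answer, w_intra)
a_weights, a_repr = hybrid_attention(answer, question, w_intra)

# Cosine similarity of the two attended representations as a matching score.
match = float(q_repr @ a_repr / (np.linalg.norm(q_repr) * np.linalg.norm(a_repr)))
print(f"matching score: {match:.3f}")
```

Since both attention distributions sum to one, their convex combination is also a valid distribution over the words of the sentence, so the attended representation remains a weighted average of word vectors.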