Huang, Yalou
Hierarchical Recurrent Attention Network for Response Generation
Xing, Chen (College of Computer and Control Engineering, College of Software, Nankai University, Tianjin) | Wu, Yu (Beihang University, Beijing) | Wu, Wei (Microsoft Research) | Huang, Yalou (College of Computer and Control Engineering, College of Software, Nankai University, Tianjin) | Zhou, Ming (Microsoft Research)
We study multi-turn response generation in chatbots, where a response is generated according to a conversation context. Existing work has modeled the hierarchy of the context but has not paid enough attention to the fact that words and utterances in the context are differentially important. As a result, such models may lose important information in the context and generate irrelevant responses. We propose a hierarchical recurrent attention network (HRAN) to model both the hierarchy and the variation in importance in a unified framework. In HRAN, a hierarchical attention mechanism attends to important parts within and among utterances with word-level attention and utterance-level attention, respectively. Empirical studies on both automatic evaluation and human judgment show that HRAN significantly outperforms state-of-the-art models for context-based response generation.
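To make the hierarchical attention mechanism concrete, the following is a minimal PyTorch sketch of one decoding step: word-level attention summarizes each utterance into a vector, and utterance-level attention then weights those vectors into a single context vector for the decoder. The additive scoring functions, layer sizes, and the use of the decoder state as the sole attention query are illustrative assumptions, not the exact HRAN configuration.

```python
# Minimal sketch of hierarchical (word-level + utterance-level) attention,
# assuming PyTorch; shapes and scoring are illustrative, not the paper's exact setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalAttention(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        # Additive attention scorers for the word level and the utterance level.
        self.word_score = nn.Linear(2 * hidden_size, 1)
        self.utt_score = nn.Linear(2 * hidden_size, 1)

    def attend(self, scorer, states, query):
        # states: (num_items, hidden), query: (hidden,) -> weighted sum (hidden,)
        scores = scorer(torch.cat([states, query.expand_as(states)], dim=-1)).squeeze(-1)
        weights = F.softmax(scores, dim=0)
        return weights @ states

    def forward(self, utterances, decoder_state):
        # utterances: list of (num_words_i, hidden) word-level RNN states,
        # one tensor per utterance in the context.
        # 1) Word-level attention inside each utterance.
        utt_vectors = torch.stack(
            [self.attend(self.word_score, u, decoder_state) for u in utterances])
        # 2) Utterance-level attention across the resulting utterance vectors.
        return self.attend(self.utt_score, utt_vectors, decoder_state)

# Toy usage with random states for a three-utterance context.
hidden = 8
attn = HierarchicalAttention(hidden)
context = attn([torch.randn(5, hidden), torch.randn(7, hidden), torch.randn(4, hidden)],
               torch.randn(hidden))
print(context.shape)  # torch.Size([8]); fed to the decoder at this step
```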
Topic Aware Neural Response Generation
Xing, Chen (Nankai University) | Wu, Wei (Microsoft Research Asia) | Wu, Yu (Beihang University) | Liu, Jie (Nankai University) | Huang, Yalou (Nankai University) | Zhou, Ming (Microsoft Research Asia) | Ma, Wei-Ying (Microsoft Research Asia)
We consider incorporating topic information into a sequence-to-sequence framework to generate informative and interesting responses for chatbots. To this end, we propose a topic-aware sequence-to-sequence (TA-Seq2Seq) model. The model uses topics to simulate the prior knowledge that guides humans to form informative and interesting responses in conversation, and it leverages topic information in generation through a joint attention mechanism and a biased generation probability. The joint attention mechanism summarizes the hidden vectors of an input message as context vectors by message attention, and synthesizes topic vectors by topic attention from the topic words of the message obtained from a pre-trained LDA model; these vectors jointly affect the generation of words in decoding. To increase the probability of topic words appearing in responses, the model modifies the generation probability of topic words by adding an extra probability term that biases the overall distribution. Empirical studies on both automatic evaluation metrics and human annotations show that TA-Seq2Seq generates more informative and interesting responses, significantly outperforming state-of-the-art response generation models.
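As an illustration of the joint attention mechanism and the biased generation probability, here is a minimal PyTorch sketch of one decoding step. The scoring functions, the way the topic-word bias is added and renormalized, and all layer sizes are simplified assumptions rather than the exact TA-Seq2Seq formulation.

```python
# Sketch of one decoding step with message attention, topic attention, and a
# biased generation probability for topic words; assumes PyTorch, illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicAwareStep(nn.Module):
    def __init__(self, hidden, vocab_size):
        super().__init__()
        self.msg_score = nn.Linear(2 * hidden, 1)      # message attention scorer
        self.topic_score = nn.Linear(2 * hidden, 1)    # topic attention scorer
        self.out = nn.Linear(3 * hidden, vocab_size)        # standard generation logits
        self.topic_out = nn.Linear(3 * hidden, vocab_size)  # extra bias term for topic words

    def attend(self, scorer, states, query):
        scores = scorer(torch.cat([states, query.expand_as(states)], -1)).squeeze(-1)
        return F.softmax(scores, 0) @ states

    def forward(self, msg_states, topic_states, dec_state, topic_word_mask):
        # msg_states: (msg_len, hidden) encoder states of the input message.
        # topic_states: (k, hidden) embeddings of topic words from a pre-trained LDA model.
        # topic_word_mask: (vocab_size,) 1.0 for topic-word ids, 0.0 otherwise.
        c = self.attend(self.msg_score, msg_states, dec_state)      # message context vector
        o = self.attend(self.topic_score, topic_states, dec_state)  # topic context vector
        features = torch.cat([dec_state, c, o], -1)
        # Bias the distribution: topic words receive an additional probability term.
        probs = F.softmax(self.out(features), -1) \
            + topic_word_mask * F.softmax(self.topic_out(features), -1)
        return probs / probs.sum()  # renormalize to a valid distribution

# Toy usage.
hidden, vocab = 8, 20
step = TopicAwareStep(hidden, vocab)
mask = torch.zeros(vocab); mask[[3, 7, 11]] = 1.0  # pretend these ids are topic words
p = step(torch.randn(6, hidden), torch.randn(3, hidden), torch.randn(hidden), mask)
print(p.shape, float(p.sum()))  # torch.Size([20]) 1.0
```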
Hashtag-Based Sub-Event Discovery Using Mutually Generative LDA in Twitter
Xing, Chen (Nankai University) | Wang, Yuan (Nankai University) | Liu, Jie (Nankai University) | Huang, Yalou (Nankai University) | Ma, Wei-Ying (Microsoft Research, China)
Sub-event discovery is an effective method for social event analysis in Twitter. It can discover sub-events from a large amount of noisy event-related information in Twitter and represent them semantically. The task is challenging because tweets are short, informal, and noisy. To address this problem, we leverage event-related hashtags, which contain many locations, dates, and concise sub-event-related descriptions, to enhance sub-event discovery. To this end, we propose a hashtag-based mutually generative Latent Dirichlet Allocation model (MGe-LDA). In MGe-LDA, the hashtags and topics of a tweet are generated by each other. This mutually generative process models the relationship between the hashtags and topics of tweets, and it highlights the role of hashtags as a semantic representation of the corresponding tweets. Experimental results show that MGe-LDA significantly outperforms state-of-the-art methods for sub-event discovery.
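The abstract does not spell out the full graphical model, so the following numpy sketch only illustrates the mutually generative idea under stated assumptions: in one direction a topic generates the tweet's hashtag, in the other a hashtag generates the topic, and words are drawn from topic-word distributions either way. The priors and the inference procedure of MGe-LDA are not represented here.

```python
# Illustrative generative story for a hashtag-topic mutual generation model;
# the conditional distributions and inference are assumptions, not MGe-LDA's exact design.
import numpy as np

rng = np.random.default_rng(0)
K, H, V = 4, 6, 50                                    # topics, hashtags, vocabulary size
phi = rng.dirichlet(np.ones(V), size=K)               # topic -> word distributions
topic_to_hashtag = rng.dirichlet(np.ones(H), size=K)  # topic -> hashtag distributions
hashtag_to_topic = rng.dirichlet(np.ones(K), size=H)  # hashtag -> topic distributions

def generate_tweet(n_words=10, direction="topic_first"):
    """Sample one tweet (topic, hashtag, words) under one generative direction."""
    if direction == "topic_first":
        z = rng.integers(K)                           # draw a topic
        h = rng.choice(H, p=topic_to_hashtag[z])      # the topic generates the hashtag
    else:
        h = rng.integers(H)                           # draw a hashtag
        z = rng.choice(K, p=hashtag_to_topic[h])      # the hashtag generates the topic
    words = rng.choice(V, size=n_words, p=phi[z])     # words come from the topic
    return z, h, words

print(generate_tweet())
print(generate_tweet(direction="hashtag_first"))
```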