Discourse & Dialogue
Improving Review Representations With User Attention and Product Attention for Sentiment Classification
Wu, Zhen (Nanjing University) | Dai, Xin-Yu (Nanjing University) | Yin, Cunyan (Nanjing University) | Huang, Shujian (Nanjing University) | Chen, Jiajun (Nanjing University)
Neural network methods have achieved great success in review sentiment classification. Recently, some works have achieved improvements by incorporating user and product information to generate a review representation. However, we observe that in reviews some words or sentences show strong user preferences, while others tend to indicate product characteristics. The two kinds of information play different roles in determining the sentiment label of a review. Therefore, it is not reasonable to encode user and product information together into one representation. In this paper, we propose a novel framework to encode user and product information. First, we apply two individual hierarchical neural networks to generate two representations, one with user attention and one with product attention. Then, we design a combined strategy to make full use of the two representations for training and final prediction. The experimental results show that our model clearly outperforms other state-of-the-art methods on the IMDB and Yelp datasets. Through visualization of attention over words related to the user or the product, we validate the observation above.
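The idea of pooling the same words under two different attention contexts can be sketched as follows. This is only an illustration, not the authors' model: the dot-product scoring, the toy two-dimensional vectors, the function name `attention_pool`, and the use of concatenation as the "combined strategy" are all assumptions made for the example.

```python
import math


def attention_pool(word_vectors, context_vector):
    """Pool word vectors into one vector, weighting each word by the
    softmax of its dot product with a context (user or product) vector.

    Hypothetical helper: the scoring function is an assumption.
    """
    scores = [sum(w * c for w, c in zip(vec, context_vector))
              for vec in word_vectors]
    # Subtract the maximum score for numerical stability.
    max_score = max(scores)
    exp_scores = [math.exp(s - max_score) for s in scores]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    dim = len(word_vectors[0])
    pooled = [sum(wt * vec[i] for wt, vec in zip(weights, word_vectors))
              for i in range(dim)]
    return pooled, weights


# Toy word vectors and context vectors (made-up values).
words = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
user_vec = [2.0, 0.0]
product_vec = [0.0, 2.0]

user_repr, user_weights = attention_pool(words, user_vec)
product_repr, product_weights = attention_pool(words, product_vec)

# Assumed combination: list concatenation of the two representations.
combined = user_repr + product_repr
```

With these toy values, the user context puts the most weight on the first word and the product context on the second, so the two pooled representations differ even though the input words are identical.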
Argument Mining for Improving the Automated Scoring of Persuasive Essays
Nguyen, Huy V. (University of Pittsburgh) | Litman, Diane J. (University of Pittsburgh)
End-to-end argument mining has enabled the development of new automated essay scoring (AES) systems that use argumentative features (e.g., number of claims, number of support relations) in addition to traditional features (e.g., grammar, discourse structure) when scoring persuasive essays. While prior research has proposed different argumentative features and empirically demonstrated their utility for AES, these studies have all had important limitations. In this paper we identify a set of desiderata for evaluating the use of argument mining for AES, introduce an end-to-end argument mining system and associated argumentative feature sets, and present the results of several studies that both satisfy the desiderata and demonstrate the added value of argument mining for scoring persuasive essays.
Cognition-Cognizant Sentiment Analysis With Multitask Subjectivity Summarization Based on Annotators' Gaze Behavior
Mishra, Abhijit (IBM Research AI) | Tamilselvam, Srikanth (IBM Research AI) | Dasgupta, Riddhiman (IBM Research AI) | Nagar, Seema (IBM Research AI) | Dey, Kuntal (IBM Research AI)
For document-level sentiment analysis (SA), Subjectivity Extraction, i.e., extracting the relevant subjective portions of the text that cover the overall sentiment expressed in the document, is an important step. Subjectivity Extraction, however, is a hard problem for systems, as it demands a great deal of world knowledge and reasoning. Humans, on the other hand, are good at extracting relevant subjective summaries from an opinionated document (say, a movie review) while inferring the sentiment expressed in it. This capability is manifested in their eye-movement behavior while reading: words pertaining to the subjective summary of the text attract much more attention in the form of gaze fixations and/or saccadic patterns. We propose a multi-task deep neural framework for document-level sentiment analysis that learns to predict the overall sentiment expressed in the given input document by simultaneously learning to predict human gaze behavior and auxiliary linguistic tasks such as part-of-speech and syntactic properties of words in the document. For this, a multi-task learning algorithm based on a multi-layer shared LSTM augmented with task-specific classifiers is proposed. With this composite multi-task network, we obtain performance competitive with or better than state-of-the-art approaches in SA. Moreover, the availability of gaze predictions as an auxiliary output helps interpret the system better; for instance, gaze predictions reveal that the system indeed performs subjectivity extraction better, which accounts for the improvement in document-level sentiment analysis performance.
Hierarchical Attention Transfer Network for Cross-Domain Sentiment Classification
Li, Zheng (Hong Kong University of Science and Technology) | Wei, Ying (Hong Kong University of Science and Technology) | Zhang, Yu (Hong Kong University of Science and Technology) | Yang, Qiang (Hong Kong University of Science and Technology)
Cross-domain sentiment classification aims to leverage useful information in a source domain to aid sentiment classification in a target domain that has little or no supervised information. Existing cross-domain sentiment classification methods cannot automatically capture non-pivots, i.e., the domain-specific sentiment words, and pivots, i.e., the domain-shared sentiment words, simultaneously. To solve this problem, we propose a Hierarchical Attention Transfer Network (HATN) for cross-domain sentiment classification. The proposed HATN provides a hierarchical attention transfer mechanism that can transfer attention for emotions across domains by automatically capturing pivots and non-pivots. In addition, the hierarchy of the attention mechanism mirrors the hierarchical structure of documents, which helps locate the pivots and non-pivots better. The proposed HATN consists of two hierarchical attention networks: one, named P-net, aims to find the pivots, and the other, named NP-net, aligns the non-pivots by using the pivots as a bridge. Specifically, P-net first conducts individual attention learning to provide positive and negative pivots for NP-net. Then, P-net and NP-net conduct joint attention learning so that the HATN can simultaneously capture pivots and non-pivots and transfer attention for emotions across domains. Experiments on the Amazon review dataset demonstrate the effectiveness of HATN.
Weakly Supervised Induction of Affective Events by Optimizing Semantic Consistency
Ding, Haibo (University of Utah) | Riloff, Ellen (University of Utah)
To understand narrative text, we must comprehend how people are affected by the events that they experience. For example, readers understand that graduating from college is a positive event (achievement) but being fired from one's job is a negative event (problem). NLP researchers have developed effective tools for recognizing explicit sentiments, but affective events are more difficult to recognize because the polarity is often implicit and can depend on both a predicate and its arguments. Our research investigates the prevalence of affective events in a personal story corpus, and introduces a weakly supervised method for large scale induction of affective events. We present an iterative learning framework that constructs a graph with nodes representing events and initializes their affective polarities with sentiment analysis tools as weak supervision. The events are then linked based on three types of semantic relations: (1) semantic similarity, (2) semantic opposition, and (3) shared components. The learning algorithm iteratively refines the polarity values by optimizing semantic consistency across all events in the graph. Our model learns over 100,000 affective events and identifies their polarities more accurately than other methods.
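The general idea of refining weakly initialized polarities over a graph can be sketched as below. This is a simplified stand-in, not the authors' algorithm: the update rule, the `alpha` mixing weight, the function name `refine_polarities`, and the toy events and edges are all assumptions. Semantic similarity is modeled as a positive edge weight and semantic opposition as a negative one; the shared-components relation is not modeled.

```python
def refine_polarities(initial, edges, alpha=0.5, iterations=20):
    """Iteratively mix each event's initial polarity with its neighbors' votes.

    initial: dict mapping event -> polarity in [-1, 1] (weak supervision).
    edges: list of (event_a, event_b, weight); a positive weight links
           similar events, a negative weight links opposite events.
    Hypothetical update rule chosen for illustration only.
    """
    neighbors = {event: [] for event in initial}
    for a, b, w in edges:
        neighbors[a].append((b, w))
        neighbors[b].append((a, w))

    polarity = dict(initial)
    for _ in range(iterations):
        updated = {}
        for event, links in neighbors.items():
            if not links:
                # Isolated events keep their initial polarity.
                updated[event] = initial[event]
                continue
            total_weight = sum(abs(w) for _, w in links)
            # Opposite neighbors (negative weight) vote with flipped sign.
            vote = sum(w * polarity[n] for n, w in links) / total_weight
            updated[event] = (1 - alpha) * initial[event] + alpha * vote
        polarity = updated
    return polarity


# Toy graph: two seed events with known polarity, three unlabeled events.
initial = {
    "graduate from college": 0.8,
    "finish my degree": 0.0,
    "get fired": -0.7,
    "lose my job": 0.0,
    "keep my job": 0.0,
}
edges = [
    ("graduate from college", "finish my degree", 1.0),  # similar
    ("get fired", "lose my job", 1.0),                   # similar
    ("lose my job", "keep my job", -1.0),                # opposite
]
refined = refine_polarities(initial, edges)
```

In this toy graph, "finish my degree" ends up positive, "lose my job" negative, and "keep my job" positive, because polarity flows along similarity edges and flips across the opposition edge.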
Addressee and Response Selection in Multi-Party Conversations With Speaker Interaction RNNs
Zhang, Rui (Yale University) | Lee, Honglak (University of Michigan) | Polymenakos, Lazaros (IBM T. J. Watson Research Center) | Radev, Dragomir (Yale University)
In this paper, we study the problem of addressee and response selection in multi-party conversations. Understanding multi-party conversations is challenging because of complex speaker interactions: multiple speakers exchange messages with each other, playing different roles (sender, addressee, observer), and these roles vary across turns. To tackle this challenge, we propose the Speaker Interaction Recurrent Neural Network (SI-RNN). Whereas the previous state-of-the-art system updated speaker embeddings only for the sender, SI-RNN uses a novel dialog encoder to update speaker embeddings in a role-sensitive way. Additionally, unlike the previous work that selected the addressee and response separately, SI-RNN selects them jointly by viewing the task as a sequence prediction problem. Experimental results show that SI-RNN significantly improves the accuracy of addressee and response selection, particularly in complex conversations with many speakers and responses to distant messages many turns in the past.
Learning Latent Opinions for Aspect-level Sentiment Classification
Wang, Bailin (University of Massachusetts Amherst) | Lu, Wei (Singapore University of Technology and Design)
Aspect-level sentiment classification aims at detecting the sentiment expressed towards a particular target in a sentence. Based on the observation that the sentiment polarity is often related to specific spans in the given sentence, it is possible to make use of such information for better classification. Such information can also serve as justification for the predictions. We propose a segmentation-attention-based LSTM model which can effectively capture the structural dependencies between the target and the sentiment expressions with a linear-chain conditional random field (CRF) layer. The model simulates the human process of inferring sentiment information when reading: given a target, humans tend to search for surrounding relevant text spans in the sentence before making an informed decision on the underlying sentiment information. We perform sentiment classification tasks on publicly available datasets of online reviews across different languages from SemEval tasks and of social comments from Twitter. Extensive experiments show that our model achieves state-of-the-art performance while extracting interpretable sentiment expressions.
CoChat: Enabling Bot and Human Collaboration for Task Completion
Luo, Xufang (Beihang University) | Lin, Zijia (Microsoft Research) | Wang, Yunhong (Beihang University) | Nie, Zaiqing (Alibaba AI Labs)
Chatbots have drawn significant attention of late in both industry and academia. For most task completion bots in industry, human intervention is the only means of avoiding mistakes in complex real-world cases. However, to the best of our knowledge, there is no existing research modeling the collaboration between task completion bots and human workers. In this paper, we introduce CoChat, a dialog management framework that enables effective collaboration between bots and human workers. In CoChat, human workers can introduce new actions at any time to handle previously unseen cases. We propose a memory-enhanced hierarchical RNN (MemHRNN) to handle the one-shot learning challenges caused by instantly introducing new actions in CoChat. Extensive experiments on real-world datasets demonstrate that CoChat can relieve most of the human workers' workload and achieve better user satisfaction rates compared to other state-of-the-art frameworks.
Customized Nonlinear Bandits for Online Response Selection in Neural Conversation Models
Liu, Bing (Carnegie Mellon University) | Yu, Tong (Carnegie Mellon University) | Lane, Ian (Carnegie Mellon University) | Mengshoel, Ole J. (Carnegie Mellon University)
Dialog response selection is an important step towards natural response generation in conversational agents. Existing work on neural conversational models mainly focuses on offline supervised learning using a large set of context-response pairs. In this paper, we focus on online learning of response selection in retrieval-based dialog systems. We propose a contextual multi-armed bandit model with a nonlinear reward function that uses distributed representation of text for online response selection. A bidirectional LSTM is used to produce the distributed representations of dialog context and responses, which serve as the input to a contextual bandit. In learning the bandit, we propose a customized Thompson sampling method that is applied to a polynomial feature space in approximating the reward. Experimental results on the Ubuntu Dialogue Corpus demonstrate significant performance gains of the proposed method over conventional linear contextual bandits. Moreover, we report encouraging response selection performance of the proposed neural bandit model using the Recall@k metric for a small set of online training samples.
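For readers unfamiliar with the Recall@k metric mentioned in the abstract, a minimal sketch follows. It assumes the common retrieval setting in which each context has exactly one correct response among a fixed candidate set; the function names and the toy scores are made up for the example and are not taken from the paper.

```python
def recall_at_k(scores, correct_index, k):
    """Return 1.0 if the correct candidate is among the k highest-scoring
    candidates, else 0.0. Assumes one correct response per context."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return 1.0 if correct_index in ranked[:k] else 0.0


def mean_recall_at_k(examples, k):
    """Average Recall@k over (scores, correct_index) pairs."""
    return sum(recall_at_k(s, c, k) for s, c in examples) / len(examples)


# Toy model scores for three contexts with three candidates each.
examples = [
    ([0.1, 0.9, 0.3], 1),  # correct candidate is ranked first
    ([0.8, 0.2, 0.5], 2),  # correct candidate is ranked second
    ([0.7, 0.6, 0.1], 2),  # correct candidate is ranked third
]

r1 = mean_recall_at_k(examples, 1)
r2 = mean_recall_at_k(examples, 2)
r3 = mean_recall_at_k(examples, 3)
```

On this toy data the metric grows with k: one of three contexts is correct at k = 1, two of three at k = 2, and all three at k = 3.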
IMS-DTM: Incremental Multi-Scale Dynamic Topic Models
Chen, Xilun (Arizona State University) | Candan, K. Selcuk (Arizona State University) | Sapino, Maria Luisa (University of Torino)
Dynamic topic models (DTMs) are commonly used for mining latent topics in evolving web corpora. In this paper, we note that a major limitation of conventional DTM-based models is that they assume a predetermined and fixed scale of topics. In reality, however, topics may have varying spans, and topics of multiple scales can co-exist in a single web or social media data stream. Therefore, DTMs that assume a fixed epoch length may fail to capture latent topics effectively, which negatively affects accuracy. We propose a Multi-Scale Dynamic Topic Model (MS-DTM) and a complementary Incremental Multi-Scale Dynamic Topic Model (IMS-DTM) inference method that can be used to capture latent topics and their dynamics simultaneously, at different scales. In this model, topic-specific feature distributions are generated based on a multi-scale feature distribution of the previous epochs; moreover, multiple scales of the current epoch are analyzed together through a novel multi-scale incremental Gibbs sampling technique. We show that the proposed model significantly improves efficiency and effectiveness compared to single-scale DTMs and to prior models that consider only multiple scales of the past.