Understanding Language in Conversations "The problems addressed in discourse research aim to answer two general kinds of questions: (1) what information is contained in extended sequences of utterances that goes beyond the meaning of the individual utterances themselves? (2) how does the context in which an utterance is used affect the meaning of the individual utterances, or parts of them?"
– Barbara Grosz. Overview of Chapter 6: Discourse and Dialogue, Survey of the State of the Art in Human Language Technology (1996).
Consumer sentiment in Japan weakened for the 12th straight month in September, hitting its lowest level since the survey started in April 2013, a Cabinet Office survey showed Wednesday. The seasonally adjusted consumer confidence index dropped 1.5 points from the previous month to 35.6, ahead of Tuesday's consumption tax increase. The survey covered households of two or more people. The index was lower than the 37.1 marked when the consumption tax was previously raised in April 2014. The latest tax increase may dent personal consumption, a key engine of Japan's economic growth at a time when exports have been dropping amid a prolonged U.S.-China trade war.
Neural end-to-end goal-oriented dialog systems showed promise to reduce the workload of human agents for customer service, as well as reduce wait time for users. However, their inability to handle new user behavior at deployment has limited their usage in real world. In this work, we propose an end-to-end trainable method for neural goal-oriented dialog systems which handles new user behaviors at deployment by transferring the dialog to a human agent intelligently. The proposed method has three goals: 1) maximize user's task success by transferring to human agents, 2) minimize the load on the human agents by transferring to them only when it is essential and 3) learn online from the human agent's responses to reduce human agents load further. We evaluate our proposed method on a modified-bAbI dialog task that simulates the scenario of new user behaviors occurring at test time. Experimental results show that our proposed method is effective in achieving the desired goals.
Topic modeling analyzes documents to learn meaningful patterns of words. Dynamic topic models capture how these patterns vary over time for a set of documents that were collected over a large time span. We develop the dynamic embedded topic model (D-ETM), a generative model of documents that combines dynamic latent Dirichlet allocation (D-LDA) and word embeddings. The D-ETM models each word with a categorical distribution whose parameter is given by the inner product between the word embedding and an embedding representation of its assigned topic at a particular time step. The word embeddings allow the D-ETM to generalize to rare words. The D-ETM learns smooth topic trajectories by defining a random walk prior over the embeddings of the topics. We fit the D-ETM using structured amortized variational inference. On a collection of United Nations debates, we find that the D-ETM learns interpretable topics and outperforms D-LDA in terms of both topic quality and predictive performance.
This paper proposes a new HDP based online review rating regression model named Topic-Sentiment-Preference Regression Analysis (TSPRA). TSPRA combines topics (i.e. product aspects), word sentiment and user preference as regression factors, and is able to perform topic clustering, review rating prediction, sentiment analysis and what we invent as "critical aspect" analysis altogether in one framework. TSPRA extends sentiment approaches by integrating the key concept "user preference" in collaborative filtering (CF) models into consideration, while it is distinct from current CF models by decoupling "user preference" and "sentiment" as independent factors. Our experiments conducted on 22 Amazon datasets show overwhelming better performance in rating predication against a state-of-art model FLAME (2015) in terms of error, Pearson's Correlation and number of inverted pairs. For sentiment analysis, we compare the derived word sentiments against a public sentiment resource SenticNet3 and our sentiment estimations clearly make more sense in the context of online reviews. Last, as a result of the de-correlation of "user preference" from "sentiment", TSPRA is able to evaluate a new concept "critical aspects", defined as the product aspects seriously concerned by users but negatively commented in reviews. Improvement to such "critical aspects" could be most effective to enhance user experience.
Mining financial text documents and understanding the sentiments of individual investors, institutions and markets is an important and challenging problem in the literature. Current approaches to mine sentiments from financial texts largely rely on domain specific dictionaries. However, dictionary based methods often fail to accurately predict the polarity of financial texts. This paper aims to improve the state-of-the-art and introduces a novel sentiment analysis approach that employs the concept of financial and non-financial performance indicators. It presents an association rule mining based hierarchical sentiment classifier model to predict the polarity of financial texts as positive, neutral or negative. The performance of the proposed model is evaluated on a benchmark financial dataset. The model is also compared against other state-of-the-art dictionary and machine learning based approaches and the results are found to be quite promising. The novel use of performance indicators for financial sentiment analysis offers interesting and useful insights.
We present a neural framework for opinion summarization from online product reviews which is knowledge-lean and only requires light supervision (e.g., in the form of product domain labels and user-provided ratings). Our method combines two weakly supervised components to identify salient opinions and form extractive summaries from multiple reviews: an aspect extractor trained under a multi-task objective, and a sentiment predictor based on multiple instance learning. We introduce an opinion summarization dataset that includes a training set of product reviews from six diverse domains and human-annotated development and test sets with gold standard aspect annotations, salience labels, and opinion summaries. Automatic evaluation shows significant improvements over baselines, and a large-scale study indicates that our opinion summaries are preferred by human judges according to multiple criteria.
Over years, a crucial part of data-gathering behavior has revolved around what other people think. With the constantly growing popularity and availability of opinion-driven resources such as personal blogs and online review sites, new challenges and opportunities are emerging as people have started using advanced technologies to make decisions now. Sentiment analysis or opinion mining, refers to the use of computational linguistics, text analytics and natural language processing to identify and extract information from source materials. Sentiment analysis is considered one of the most popular applications of text analytics. The primary aspect of sentiment analysis includes data analysis on the body of the text for understanding the opinion expressed by it and other key factors comprising modality and mood.
Inspired by recent works in Aspect-Based Sentiment Analysis (ABSA) on product reviews and faced with more complex posts on social media platforms mentioning multiple entities as well as multiple aspects, we define a novel task called Multi-Entity Aspect-Based Sentiment Analysis (ME-ABSA). This task aims at fine-grained sentiment analysis of (entity, aspect) combinations, making the well-studied ABSA task a special case of it. To address the task, we propose an innovative method that models Context memory, Entity memory and Aspect memory, called CEA method. Our experimental results show that our CEA method achieves a significant gain over several baselines, including the state-of-the-art method for the ABSA task, and their enhanced versions, on datasets for ME-ABSA and ABSA tasks. The in-depth analysis illustrates the significant advantage of the CEA method over baseline methods for several hard-to-predict post types. Furthermore, we show that the CEA method is capable of generalizing to new (entity, aspect) combinations with little loss of accuracy. This observation indicates that data annotation in real applications can be largely simplified.
Bayesian inference methods for probabilistic topic models can quantify uncertainty in the parameters, which has primarily been used to increase the robustness of parameter estimates. In this work, we explore other rich information that can be obtained by analyzing the posterior distributions in topic models. Experimenting with latent Dirichlet allocation on two datasets, we propose ideas incorporating information about the posterior distributions at the topic level and at the word level. At the topic level, we propose a metric called topic stability that measures the variability of the topic parameters under the posterior. We show that this metric is correlated with human judgments of topic quality as well as with the consistency of topics appearing across multiple models. At the word level, we experiment with different methods for adjusting individual word probabilities within topics based on their uncertainty. Humans prefer words ranked by our adjusted estimates nearly twice as often when compared to the traditional approach. Finally, we describe how the ideas presented in this work could potentially applied to other predictive or exploratory models in future work.
Li, Zheng (Hong Kong University of Science and Technology) | Wei, Ying (Hong Kong University of Science and Technology) | Zhang, Yu (Hong Kong University of Science and Technology) | Yang, Qiang (Hong Kong University of Science and Technology)
Cross-domain sentiment classification aims to leverage useful information in a source domain to help do sentiment classification in a target domain that has no or little supervised information. Existing cross-domain sentiment classification methods cannot automatically capture non-pivots, i.e., the domain-specific sentiment words, and pivots, i.e., the domain-shared sentiment words, simultaneously. In order to solve this problem, we propose a Hierarchical Attention Transfer Network (HATN) for cross-domain sentiment classification. The proposed HATN provides a hierarchical attention transfer mechanism which can transfer attentions for emotions across domains by automatically capturing pivots and non-pivots. Besides, the hierarchy of the attention mechanism mirrors the hierarchical structure of documents, which can help locate the pivots and non-pivots better. The proposed HATN consists of two hierarchical attention networks, with one named P-net aiming to find the pivots and the other named NP-net aligning the non-pivots by using the pivots as a bridge. Specifically, P-net firstly conducts individual attention learning to provide positive and negative pivots for NP-net. Then, P-net and NP-net conduct joint attention learning such that the HATN can simultaneously capture pivots and non-pivots and realize transferring attentions for emotions across domains. Experiments on the Amazon review dataset demonstrate the effectiveness of HATN.