With the development of Web 2.0, sentiment analysis has become a popular research problem. Recently, topic models have been introduced for the simultaneous analysis of topics and sentiment in a document. These studies, which jointly model topic and sentiment, take advantage of the relationship between topics and sentiment, and are shown to be superior to traditional sentiment analysis tools. However, most of them assume that, given the parameters, the sentiments of the words in a document are all independent. In our observation, in contrast, sentiments are expressed in a coherent way. Local conjunctive words, such as “and” or “but”, are often indicative of sentiment transitions. In this paper, we propose a major departure from previous approaches by making two linked contributions. First, we assume that sentiments are related to the topics in the document, and put forward a joint sentiment and topic model, namely Sentiment-LDA. Second, we observe that sentiments depend on local context. Thus, we further extend the Sentiment-LDA model to the Dependency-Sentiment-LDA model by relaxing the sentiment independence assumption in Sentiment-LDA. In Dependency-Sentiment-LDA, the sentiments of words are viewed as a Markov chain. Through experiments, we show that exploiting the sentiment dependency is clearly advantageous, and that Dependency-Sentiment-LDA is an effective approach for sentiment analysis.
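The idea of treating word sentiments as a Markov chain whose transitions are influenced by conjunctions can be illustrated with a minimal sketch. It is not the Dependency-Sentiment-LDA model itself: topics and inference are omitted, and the names `STAY_PROB`, `DEFAULT_STAY`, `transition`, and `chain_probability`, as well as all probability values, are assumptions made up for illustration.

```python
from typing import Dict, List, Optional

SENTIMENTS = ("pos", "neg")

# Hypothetical probabilities that a word keeps the previous word's sentiment.
# "and" tends to preserve sentiment, "but" tends to flip it.
STAY_PROB: Dict[str, float] = {"and": 0.9, "but": 0.1}
DEFAULT_STAY = 0.7  # assumed value for all other words


def transition(prev: str, word: str) -> Dict[str, float]:
    """Distribution over the current sentiment given the previous one."""
    stay = STAY_PROB.get(word.lower(), DEFAULT_STAY)
    return {s: (stay if s == prev else 1.0 - stay) for s in SENTIMENTS}


def chain_probability(
    words: List[str],
    sentiments: List[str],
    initial: Optional[Dict[str, float]] = None,
) -> float:
    """Probability of a sentiment sequence under the first-order chain."""
    if not words or len(words) != len(sentiments):
        raise ValueError("words and sentiments must be non-empty and aligned")
    initial = initial or {"pos": 0.5, "neg": 0.5}
    prob = initial[sentiments[0]]
    for i in range(1, len(words)):
        prob *= transition(sentiments[i - 1], words[i])[sentiments[i]]
    return prob


if __name__ == "__main__":
    words = ["good", "but", "pricey"]
    print(chain_probability(words, ["pos", "neg", "neg"]))
    print(chain_probability(words, ["pos", "pos", "pos"]))
```

Under these assumed values, a sequence that flips sentiment at “but” scores higher than one that keeps it, which is the kind of local dependency an independence assumption cannot capture.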
We explore the relationship between negated text and negative sentiment in the task of sentiment classification. We propose a novel adjustment factor, based on negation occurrences as a proxy for negative sentiment, that can be applied to lexicon-based classifiers equipped with a negation detection pre-processing step. We performed an experiment on a multi-domain customer review dataset, obtaining accuracy improvements over a baseline, and we further improved our results by using out-of-domain data to calibrate the adjustment factor. We see possibilities for future work in refining negation detection and in expanding the experiment to a broader spectrum of opinionated discourse, beyond customer reviews.
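A lexicon-based classifier with a negation-driven adjustment might look roughly like the following sketch. The abstract does not give the formula for the adjustment factor, so the form used here (subtracting a constant per negation cue) is an assumption, as are the toy `LEXICON`, the `NEGATION_CUES` set, the fixed negation `scope`, and the function names.

```python
from typing import Dict, List

# Hypothetical toy lexicon and negation cues, for illustration only.
LEXICON: Dict[str, float] = {"good": 1.0, "great": 2.0, "bad": -1.0, "poor": -1.5}
NEGATION_CUES = {"not", "no", "never"}


def score_review(tokens: List[str], adjustment: float = 0.5, scope: int = 3) -> float:
    """Sum lexicon scores, flipping polarity inside a negation scope, then
    subtract `adjustment` once per negation cue (assumed form of the factor)."""
    total = 0.0
    negations = 0
    remaining_scope = 0  # number of upcoming tokens still treated as negated
    for tok in tokens:
        word = tok.lower()
        if word in NEGATION_CUES:
            negations += 1
            remaining_scope = scope
            continue
        polarity = LEXICON.get(word, 0.0)
        if remaining_scope > 0:
            polarity = -polarity
            remaining_scope -= 1
        total += polarity
    return total - adjustment * negations


def classify(tokens: List[str], adjustment: float = 0.5) -> str:
    """Label a review as positive or negative from its adjusted score."""
    return "positive" if score_review(tokens, adjustment) > 0 else "negative"


if __name__ == "__main__":
    print(classify("the food was not good".split()))
    print(classify("great service".split()))
```

In this sketch `adjustment` is a free parameter. Calibrating it on held-out data, possibly from another domain as the abstract describes, would amount to choosing the value that maximizes accuracy on that data.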
To help users quickly understand the major opinions in massive online reviews, it is important to automatically reveal the latent structure of the aspects, the sentiment polarities, and the associations between them. However, little existing work does this effectively. In this paper, we propose a hierarchical aspect sentiment model (HASM) to discover a hierarchical structure of aspect-based sentiments from unlabeled online reviews. In HASM, the whole structure is a tree. Each node is itself a two-level tree, whose root represents an aspect and whose children represent the sentiment polarities associated with it. Each aspect or sentiment polarity is modeled as a distribution over words. To automatically extract both the structure and the parameters of the tree, we use a Bayesian nonparametric model, the recursive Chinese Restaurant Process (rCRP), as the prior and jointly infer the aspect-sentiment tree from the review texts. Experiments on two real datasets show that our model is comparable to two other hierarchical topic models in terms of quantitative measures of topic trees. We also show that our model achieves better sentence-level classification accuracy than previously proposed aspect-sentiment joint models.
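As background for the prior mentioned above, the following sketch computes the seating probabilities of the plain (non-recursive) Chinese Restaurant Process. It is not the rCRP used in HASM, which applies a similar choice recursively down a tree; the function name `crp_probabilities` and the concentration parameter name `alpha` are assumptions for illustration.

```python
from typing import List


def crp_probabilities(table_counts: List[int], alpha: float) -> List[float]:
    """Seating probabilities for the next customer in a plain CRP.

    Existing table k is chosen with probability n_k / (n + alpha) and a new
    table with probability alpha / (n + alpha), where n is the number of
    customers already seated. The last entry of the result is the new table.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n = sum(table_counts)
    denom = n + alpha
    probs = [count / denom for count in table_counts]
    probs.append(alpha / denom)
    return probs


if __name__ == "__main__":
    # Two existing tables (think: two aspects) holding 3 and 1 customers.
    print(crp_probabilities([3, 1], alpha=1.0))
```

Because a new table always has non-zero probability, the number of tables (here, loosely, aspects) does not have to be fixed in advance, which is what makes this family of priors suitable for learning the tree structure from data.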
In this paper, we focus on the task of extracting named entities together with their associated sentiment information in a joint manner. Our key observation in such an entity-level sentiment analysis (a.k.a. targeted sentiment analysis) task is that there exists a sentiment scope within which each named entity is embedded, which largely decides the sentiment information associated with the entity. However, such sentiment scopes are typically not explicitly annotated in the data, and their lengths can be unbounded. Motivated by this, unlike traditional approaches that cast this problem as a simple sequence labeling task, we propose a novel approach that can explicitly model the latent sentiment scopes. Our experiments on the standard datasets demonstrate that our approach is able to achieve better results compared to existing approaches based on conventional conditional random fields (CRFs) and a more recent work based on neural networks.
Sentiment analysis research has predominantly focused on English texts. Thus there exist many sentiment resources for English, but far fewer for other languages. Approaches to improving sentiment analysis in a resource-poor focus language include: (a) translating the focus-language text into a resource-rich language such as English, and applying a powerful English sentiment analysis system to the translated text, and (b) translating resources such as sentiment-labeled corpora and sentiment lexicons from English into the focus language, and using them as additional resources in the focus-language sentiment analysis system. In this paper we systematically examine both options. We use Arabic social media posts as a stand-in for the focus-language text. We show that sentiment analysis of English translations of Arabic texts produces competitive results, w.r.t.