Sentiment Analysis of Short Informal Texts

Journal of Artificial Intelligence Research

We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features. The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words. The system ranked first in the SemEval-2013 shared task 'Sentiment Analysis in Twitter' (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task). The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts. The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points.
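
The lexicon-generation idea above can be illustrated with a minimal sketch. The abstract does not state how word scores are computed, so this example assumes a PMI-difference score (association with positive tweets minus association with negative tweets) and a simple rule that marks words between a negator and the next punctuation mark as negated. The names `NEGATORS`, `PUNCT`, `mark_negation`, `build_lexicon`, the `_NEG` suffix, and the `smoothing` parameter are all hypothetical and chosen for illustration only.

```python
import math
from collections import Counter

# Illustrative, deliberately tiny lists (assumptions, not from the paper).
NEGATORS = {"not", "no", "never", "cannot"}
PUNCT = {".", ",", "!", "?", ";", ":"}

def mark_negation(tokens):
    """Append _NEG to every token between a negator and the next punctuation mark."""
    out, negated = [], False
    for tok in tokens:
        if tok in PUNCT:
            negated = False
            out.append(tok)
        elif tok in NEGATORS or tok.endswith("n't"):
            negated = True
            out.append(tok)
        else:
            out.append(tok + "_NEG" if negated else tok)
    return out

def build_lexicon(labeled_tweets, smoothing=1.0):
    """labeled_tweets: iterable of (tokens, label), label in {'positive', 'negative'}.

    The label stands in for a sentiment hashtag or emoticon found in the tweet.
    Negated tokens get their own entries, which gives a separate score for
    words in negated contexts.
    """
    counts = {"positive": Counter(), "negative": Counter()}
    for tokens, label in labeled_tweets:
        counts[label].update(t for t in mark_negation(tokens) if t not in PUNCT)
    n_pos = sum(counts["positive"].values())
    n_neg = sum(counts["negative"].values())
    lexicon = {}
    for word in set(counts["positive"]) | set(counts["negative"]):
        c_pos = counts["positive"][word] + smoothing
        c_neg = counts["negative"][word] + smoothing
        # PMI(w, pos) - PMI(w, neg) simplifies to this log ratio.
        lexicon[word] = math.log2((c_pos * n_neg) / (c_neg * n_pos))
    return lexicon

tweets = [
    (["good", "day"], "positive"),
    (["not", "good", "."], "negative"),
    (["bad", "day"], "negative"),
    (["not", "bad", "!"], "positive"),
]
lexicon = build_lexicon(tweets)
```

In this toy corpus, `good` receives a positive score while `good_NEG` receives a negative one, which is the effect a separate lexicon for negated words is meant to capture. The sketch assumes both classes contain at least one token.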


Acquiring Knowledge of Affective Events from Blogs Using Label Propagation

AAAI Conferences

Many common events in our daily life affect us in positive and negative ways. For example, going on vacation is typically an enjoyable event, while being rushed to the hospital is an undesirable event. In narrative stories and personal conversations, recognizing that some events have a strong affective polarity is essential to understand the discourse and the emotional states of the affected people. However, current NLP systems mainly depend on sentiment analysis tools, which fail to recognize many events that are implicitly affective based on human knowledge about the event itself and cultural norms. Our goal is to automatically acquire knowledge of stereotypically positive and negative events from personal blogs. Our research creates an event context graph from a large collection of blog posts and uses a sentiment classifier and semi-supervised label propagation algorithm to discover affective events. We explore several graph configurations that propagate affective polarity across edges using local context, discourse proximity, and event-event co-occurrence. We then harvest highly affective events from the graph and evaluate the agreement of the polarities with human judgements.
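
As a rough illustration of semi-supervised label propagation over an event graph, the sketch below uses a generic clamped averaging scheme: seed events keep the polarity assigned by a sentiment classifier, and every other event repeatedly takes the weighted mean of its neighbours' scores. This is not the authors' specific algorithm or graph configuration; the function name `propagate`, the edge weights, and the example events are assumptions made for the example.

```python
def propagate(edges, seeds, iterations=50):
    """Spread polarity scores from seed nodes across an undirected weighted graph.

    edges: iterable of (node_a, node_b, weight) with positive weights.
    seeds: dict mapping node -> polarity in [-1, 1] (e.g. classifier output).
    Returns a dict mapping every node in the graph to a polarity score.
    """
    adjacency = {}
    for a, b, weight in edges:
        adjacency.setdefault(a, []).append((b, weight))
        adjacency.setdefault(b, []).append((a, weight))

    scores = {node: seeds.get(node, 0.0) for node in adjacency}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            if node in seeds:
                updated[node] = seeds[node]  # seed labels stay clamped
            else:
                total = sum(w for _, w in neighbours)
                updated[node] = sum(w * scores[m] for m, w in neighbours) / total
        scores = updated
    return scores

# Hypothetical event graph: weights could come from co-occurrence counts.
edges = [
    ("go on vacation", "relax on beach", 1.0),
    ("rushed to hospital", "break leg", 1.0),
    ("relax on beach", "break leg", 0.1),
]
seeds = {"go on vacation": 1.0, "rushed to hospital": -1.0}
scores = propagate(edges, seeds)
```

Here the unlabeled events inherit most of their polarity from the strongly connected seed, and the weak cross edge pulls each score slightly toward zero.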


Expanding Domain Sentiment Lexicon through Double Propagation

AAAI Conferences

In most sentiment analysis applications, the sentiment lexicon plays a key role. However, it is hard, if not impossible, to collect and maintain a universal sentiment lexicon for all application domains because different words may be used in different domains. The main existing technique extracts such sentiment words from a large domain corpus based on different conjunctions and the idea of sentiment coherency in a sentence. In this paper, we propose a novel propagation approach that exploits the relations between sentiment words and topics or product features that the sentiment words modify, and also sentiment words and product features themselves to extract new sentiment words. As the method propagates information through both sentiment words and features, we call it double propagation. The extraction rules are designed based on relations described in dependency trees. A new method is also proposed to assign polarities to newly discovered sentiment words in a domain. Experimental results show that our approach is able to extract a large number of new sentiment words. The polarity assignment method is also effective.
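
The propagation loop described above can be sketched as follows, under strong simplifying assumptions. The dependency parse is assumed to be already reduced to three hypothetical relation lists: `modifies` pairs (sentiment word, feature), `opinion_conj` pairs of conjoined sentiment words, and `target_conj` pairs of conjoined features. The real extraction rules operate on dependency trees and are richer than this, and the polarity assignment method is not shown because the abstract does not describe it.

```python
def double_propagation(seed_opinions, modifies, opinion_conj=(), target_conj=()):
    """Alternately grow a set of sentiment words and a set of product features.

    seed_opinions: iterable of known sentiment words.
    modifies: iterable of (sentiment_word, feature) dependency pairs.
    opinion_conj / target_conj: iterable of conjoined word pairs.
    Returns (opinions, targets) once no rule adds anything new.
    """
    opinions = set(seed_opinions)
    targets = set()
    changed = True
    while changed:
        changed = False
        for opinion, target in modifies:
            # Known sentiment word -> the feature it modifies.
            if opinion in opinions and target not in targets:
                targets.add(target)
                changed = True
            # Known feature -> a new sentiment word that modifies it.
            if target in targets and opinion not in opinions:
                opinions.add(opinion)
                changed = True
        for a, b in opinion_conj:
            # Sentiment word -> sentiment word joined by a conjunction.
            if (a in opinions) != (b in opinions):
                opinions.update((a, b))
                changed = True
        for a, b in target_conj:
            # Feature -> feature joined by a conjunction.
            if (a in targets) != (b in targets):
                targets.update((a, b))
                changed = True
    return opinions, targets

opinions, targets = double_propagation(
    seed_opinions={"good"},
    modifies=[("good", "screen"), ("amazing", "screen"),
              ("amazing", "battery"), ("sleek", "design")],
    opinion_conj=[("amazing", "sturdy")],
    target_conj=[("battery", "charger")],
)
```

Starting from the single seed `good`, the loop reaches `amazing` and `sturdy` as sentiment words and `screen`, `battery`, and `charger` as features. The pair `("sleek", "design")` is never reached because nothing connects it to the seed.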


A Novel Human Computation Game for Critique Aggregation

AAAI Conferences

We present a human computation game based on the popular board game Dixit. We ask the players not only for annotations, but also for a direct critique of the result of an automated system. We present the results of the initial run of the game, in which the answers of 15 players were used to profile the mistakes of an aspect-based opinion mining system. We show that the gameplay allowed us to identify the major faults of the extracted opinions. The players' actions thus helped improve the opinion extraction algorithm.


Co-Training Based Bilingual Sentiment Lexicon Learning

AAAI Conferences

In this paper, we address the issue of bilingual sentiment lexicon learning (BSLL), which aims to automatically and simultaneously generate sentiment words for two languages. The underlying motivation is that sentiment information from two languages can perform iterative mutual-teaching in the learning procedure. We propose to develop two classifiers to determine the sentiment polarities of words under a co-training framework, which makes full use of the two-view sentiment information from the two languages. The word alignment derived from the parallel corpus is leveraged to design effective features and to bridge the learning of the two classifiers. The experimental results on English and Chinese show the effectiveness of our approach in BSLL.
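
A minimal sketch of the mutual-teaching loop, assuming a toy nearest-centroid classifier per language and hypothetical two-dimensional word features. In each round, each view labels the unlabeled word it is most confident about and passes that label to the aligned word in the other language. The classifier, the features, the romanized example words, and the names `predict` and `co_train` are illustrative assumptions and are not taken from the paper. Each seed set must contain at least one positive and one negative word.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(features, labeled, word):
    """Nearest-centroid prediction: returns (label, confidence margin)."""
    dists = {}
    for label in (+1, -1):
        members = [features[w] for w, l in labeled.items() if l == label]
        dists[label] = sq_dist(features[word], centroid(members))
    return min(dists, key=dists.get), abs(dists[+1] - dists[-1])

def co_train(feats_a, feats_b, seeds_a, seeds_b, alignment, rounds=10):
    """alignment: list of (word_in_a, word_in_b) pairs from a parallel corpus."""
    feats = [feats_a, feats_b]
    labeled = [dict(seeds_a), dict(seeds_b)]
    align = [dict(alignment), {b: a for a, b in alignment}]
    for _ in range(rounds):
        progressed = False
        for view in (0, 1):
            other = 1 - view
            pool = [w for w in feats[view] if w not in labeled[view]]
            if not pool:
                continue
            # Pick the unlabeled word this view is most confident about.
            scored = [(predict(feats[view], labeled[view], w), w) for w in pool]
            (label, _), word = max(scored, key=lambda item: item[0][1])
            labeled[view][word] = label
            # Teach the other view through the word alignment.
            partner = align[view].get(word)
            if (partner is not None and partner in feats[other]
                    and partner not in labeled[other]):
                labeled[other][partner] = label
            progressed = True
        if not progressed:
            break
    return labeled[0], labeled[1]

feats_en = {"good": [1.0, 0.0], "bad": [0.0, 1.0],
            "great": [0.95, 0.05], "awful": [0.2, 0.8]}
feats_zh = {"hao": [1.0, 0.0], "huai": [0.0, 1.0],
            "bang": [0.5, 0.5], "zao": [0.2, 0.8]}
lex_en, lex_zh = co_train(
    feats_en, feats_zh,
    seeds_a={"good": 1, "bad": -1},
    seeds_b={"hao": 1, "huai": -1},
    alignment=[("great", "bang"), ("awful", "zao")],
)
```

In this toy setup, `bang` is ambiguous in its own view (equidistant from both centroids) and receives its positive label from the aligned English word `great`, while `awful` is labeled through its alignment with `zao`.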