Co-Training Based Bilingual Sentiment Lexicon Learning

AAAI Conferences

In this paper, we address the issue of bilingual sentiment lexicon learning (BSLL), which aims to automatically and simultaneously generate sentiment words for two languages. The underlying motivation is that sentiment information from the two languages can be used for iterative mutual teaching during learning. We propose to develop two classifiers that determine the sentiment polarities of words under a co-training framework, which makes full use of the two-view sentiment information from the two languages. The word alignment derived from a parallel corpus is leveraged to design effective features and to bridge the learning of the two classifiers. Experimental results on English and Chinese show the effectiveness of our approach to BSLL.
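
The co-training idea described above can be illustrated with a minimal sketch. It is not the paper's method: the nearest-centroid classifier, the two-dimensional feature vectors, the seed labels, and the names `train_centroids`, `predict`, and `co_train` are all assumptions made for illustration. The one idea taken from the abstract is that each language is a separate view of the same aligned word pairs, and that each view's classifier passes its most confident labels to the other.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def train_centroids(X: List[Vector], labels: Dict[int, int]) -> Dict[int, Vector]:
    """Compute the mean feature vector of each class from the labeled items."""
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = {}
    for i, y in labels.items():
        acc = sums.setdefault(y, [0.0] * len(X[i]))
        for k, v in enumerate(X[i]):
            acc[k] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids: Dict[int, Vector], x: Vector) -> Tuple[int, float]:
    """Return (label, confidence); confidence is the distance margin
    between the nearest and the second-nearest class centroid."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)), y) for y, c in centroids.items()
    )
    (d0, y0), (d1, _) = dists[0], dists[1]
    return y0, d1 - d0


def co_train(X_en: List[Vector], X_zh: List[Vector], seeds: Dict[int, int],
             rounds: int = 10, per_round: int = 1) -> Dict[int, int]:
    """Item i is one aligned word pair, seen through an English view X_en[i]
    and a Chinese view X_zh[i]. Both views share one label pool, so a label
    chosen by one view becomes training data for the other."""
    labels = dict(seeds)  # seeds must contain at least one item per class
    for _ in range(rounds):
        added = False
        for X in (X_en, X_zh):  # the two views take turns
            unlabeled = [i for i in range(len(X)) if i not in labels]
            if not unlabeled:
                break
            cents = train_centroids(X, labels)
            scored = sorted(((predict(cents, X[i]), i) for i in unlabeled),
                            key=lambda t: -t[0][1])
            for (y, _), i in scored[:per_round]:  # keep only the most confident
                labels[i] = y
                added = True
        if not added:
            break
    return labels


# Hypothetical toy data: items 0-2 are positive, items 3-5 are negative.
# Items 1 and 4 are ambiguous in the Chinese view but clear in the English view.
X_en = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [0.0, 1.0], [0.1, 0.9], [0.3, 0.8]]
X_zh = [[0.9, 0.1], [0.5, 0.5], [0.9, 0.0], [0.1, 0.9], [0.5, 0.5], [0.0, 1.0]]
seeds = {0: +1, 3: -1}
result = co_train(X_en, X_zh, seeds)
```

In this toy run the English view labels the pairs that the Chinese view cannot separate, which is the mutual-teaching effect the abstract refers to. A real system would replace the centroid classifier and the hand-made vectors with the alignment-based features the authors describe.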


5 Things You Need to Know about Sentiment Analysis and Classification

@machinelearnbot

In recent years, Sentiment Analysis has become a hot topic of scientific and market research in the field of Natural Language Processing (NLP) and Machine Learning. Below, you can find 5 useful things you need to know about Sentiment Analysis that are connected to Social Media, Datasets, Machine Learning, Visualizations, and Evaluation Methods applied by researchers and market experts. Sentiment Analysis studies texts, such as posts and reviews, that users upload to microblogging platforms, forums, and e-commerce sites, and examines the opinions they express about a product, service, event, person, or idea. The most common use of Sentiment Analysis is that of assigning a text to a class. Depending on the dataset and the purpose, Sentiment Classification can be a binary (positive or negative) or a multi-class (3 or more classes) problem.


Radical-Based Hierarchical Embeddings for Chinese Sentiment Analysis at Sentence Level

AAAI Conferences

Text representation in Chinese sentiment analysis usually operates at the word or character level. In this paper, we show that radical-level processing can greatly improve sentiment classification performance. In particular, we propose two types of Chinese radical-based hierarchical embeddings. The embeddings incorporate not only semantics at the radical and character levels, but also sentiment information. To evaluate our embeddings, we conduct Chinese sentiment analysis at the sentence level on four different datasets. Experimental results validate our assumption that radical-level semantics and sentiments can contribute to sentence-level sentiment classification, and they demonstrate the superiority of our embeddings over classic textual features and popular word and character embeddings.


Identifying Sentiment Words Using an Optimization Model with L1 Regularization

AAAI Conferences

Sentiment word identification is a fundamental task in numerous applications of sentiment analysis and opinion mining, such as review mining, opinion holder finding, and Twitter classification. In this paper, we propose an optimization model with L1 regularization, called ISOMER, for identifying sentiment words in a corpus. Unlike most existing work, which uses seed words only, our model can employ both seed words and documents with sentiment labels. The L1 penalty in the objective function yields a sparse solution, since most candidate words have no sentiment. Experiments on real datasets show that ISOMER outperforms the classic approaches, and that the lexicon learned by ISOMER can be effectively adapted to document-level sentiment analysis.
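
The effect of the L1 penalty can be seen in a small sketch. This is not ISOMER's objective, which the abstract does not give. It assumes a plain squared loss on labeled documents plus an L1 penalty, minimized by proximal gradient descent with soft-thresholding. The vocabulary, the documents, and the names `soft_threshold` and `l1_word_weights` are hypothetical.

```python
from typing import List


def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |v|: shrink v toward zero by t,
    returning exactly 0.0 when |v| <= t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def l1_word_weights(docs: List[List[float]], labels: List[float],
                    lam: float = 0.1, step: float = 0.1,
                    iters: int = 500) -> List[float]:
    """Minimize (1 / 2n) * sum_i (x_i . w - y_i)^2 + lam * ||w||_1
    by proximal gradient descent (ISTA)."""
    n, n_words = len(docs), len(docs[0])
    w = [0.0] * n_words
    for _ in range(iters):
        grad = [0.0] * n_words
        for x, y in zip(docs, labels):
            residual = sum(a * b for a, b in zip(x, w)) - y
            for k, a in enumerate(x):
                grad[k] += residual * a / n
        # Gradient step on the smooth loss, then shrinkage for the L1 term.
        w = [soft_threshold(wk - step * gk, step * lam) for wk, gk in zip(w, grad)]
    return w


# Hypothetical toy corpus: bag-of-words counts over four candidate words.
vocab = ["great", "awful", "the", "movie"]
docs = [
    [1, 0, 1, 1],  # "the great movie"  -> positive
    [1, 0, 1, 0],  # "the great"        -> positive
    [0, 1, 1, 1],  # "the awful movie"  -> negative
    [0, 1, 1, 0],  # "the awful"        -> negative
]
labels = [1, 1, -1, -1]
weights = l1_word_weights(docs, labels)
lexicon = {word: w for word, w in zip(vocab, weights) if w != 0.0}
```

On this toy corpus only "great" and "awful" keep nonzero weights, while "the" and "movie" are driven to exactly zero. That is the sparsity the abstract attributes to the L1 penalty: words carrying no sentiment drop out of the learned lexicon.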


iFeel 2.0: A Multilingual Benchmarking System for Sentence-Level Sentiment Analysis

AAAI Conferences

Sentiment analysis has become a hot topic, especially given the amount of opinions available in social media data. With the increasing interest in this theme, several methods have been proposed in the literature. Recent efforts have shown that no single method always achieves the best prediction performance across different datasets. Additionally, novel methods have not been extensively compared with other methods and across different datasets, especially methods that are not designed for the English language. Consequently, researchers tend to accept any popular method as a valid methodology to measure sentiments, a practice that is usual in science. In this context, we propose iFeel 2.0, an online web system that implements 19 sentence-level sentiment analysis methods and allows users to easily label a dataset with all of them. iFeel aims at easing the comparison of new methods with baseline approaches and can also be helpful for those interested in using sentiment analysis, allowing them to choose an appropriate sentiment analysis method that works well for a new dataset. We also incorporate a multiple-language feature that allows methods designed for specific languages to be easily compared with a baseline approach that simply translates the input data to English and runs these 19 methods. We hope this system can represent an important contribution to this field.