This paper describes the Amobee sentiment analysis system, adapted to compete in SemEval 2017 task 4. The system consists of two parts: supervised training of RNN models on a Twitter sentiment treebank, and the use of feedforward NN, Naive Bayes, and logistic regression classifiers to produce predictions for the different sub-tasks. The algorithm reached 3rd place on the 5-label classification task (sub-task C).
Deep neural networks have recently had great success in sentiment classification. However, these approaches do not fully exploit linguistic knowledge. In this paper, we propose a novel sentiment lexicon enhanced attention-based LSTM (SLEA-LSTM) model to improve the performance of sentence-level sentiment classification. Our method integrates a sentiment lexicon into deep neural networks via single-head or multi-head attention mechanisms. We conduct extensive experiments on the MR and SST datasets. The experimental results show that our model achieves comparable or better performance than the state-of-the-art methods.
Where's the best place to look for free online datasets for NLP? We combed the web to create the ultimate cheat sheet, broken down into datasets for text, audio speech, and sentiment analysis. Sentiment140: A popular dataset of 1.6 million tweets with emoticons pre-removed. Twitter US Airline Sentiment: Twitter data on US airlines from February 2015, classified as positive, negative, and neutral tweets. Yelp Reviews: An open dataset released by Yelp, containing more than 5 million reviews.
Fu, Peng (Institute of Information Engineering, Chinese Academy of Sciences) | Lin, Zheng (Institute of Information Engineering, Chinese Academy of Sciences) | Yuan, Fengcheng (Institute of Information Engineering, Chinese Academy of Sciences) | Wang, Weiping (Institute of Information Engineering, Chinese Academy of Sciences) | Meng, Dan (Institute of Information Engineering, Chinese Academy of Sciences)
Context-based word embedding learning approaches can model rich semantic and syntactic information. However, this is problematic for sentiment analysis because words with similar contexts but opposite sentiment polarities, such as good and bad, are mapped to nearby word vectors in the embedding space. Recently, some sentiment embedding learning methods have been proposed, but most of them are designed to work well on sentence-level texts. Directly applying those models to document-level texts often leads to unsatisfactory results. To address this issue, we present a sentiment-specific word embedding learning architecture that utilizes local context information as well as a global sentiment representation. The architecture is applicable to both sentence-level and document-level texts. We take the global sentiment representation to be a simple average of the word embeddings in the text, and use a corruption strategy as a sentiment-dependent regularization. Extensive experiments conducted on several benchmark datasets demonstrate that the proposed architecture outperforms the state-of-the-art methods for sentiment classification.
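The averaging step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the corruption strategy, so the sketch assumes it means randomly dropping words before averaging and rescaling the result so that its expected value matches the uncorrupted average. The function name, the `drop_prob` parameter, and the toy embeddings are all hypothetical.

```python
import random

# Hypothetical 2-dimensional toy embeddings, for illustration only.
toy_embeddings = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "movie": [0.0, 2.0],
}


def global_sentiment_representation(tokens, embeddings, drop_prob=0.0, rng=None):
    """Average the embeddings of the known tokens in a text.

    With drop_prob > 0, each token is independently dropped before summing
    (an ASSUMED form of corruption, not confirmed by the abstract). The sum
    is divided by (1 - drop_prob) * n so that, in expectation, the corrupted
    representation equals the plain average.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    rng = rng or random.Random()

    # Ignore out-of-vocabulary tokens.
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token has an embedding")

    # Keep each vector with probability 1 - drop_prob.
    kept = [v for v in vectors if rng.random() >= drop_prob]

    dim = len(vectors[0])
    total = [0.0] * dim
    for vec in kept:
        for i, value in enumerate(vec):
            total[i] += value

    scale = 1.0 / ((1.0 - drop_prob) * len(vectors))
    return [value * scale for value in total]


# Without corruption this is the plain average of the two word vectors.
plain = global_sentiment_representation(["good", "movie"], toy_embeddings)
# plain == [0.5, 1.0]

# With corruption the result varies from call to call unless rng is seeded.
noisy = global_sentiment_representation(
    ["good", "movie", "bad"], toy_embeddings, drop_prob=0.5, rng=random.Random(0)
)
```

Under this assumption, corruption would be applied only during training, and inference would use `drop_prob=0.0` to obtain the plain average.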
An obstacle to the development of many natural language processing products is the vast number of training examples necessary to get satisfactory results. Generating these examples is often a tedious and time-consuming task. This paper proposes a method to transform the sentiment of sentences in order to limit the work necessary to generate more training data. This means that one sentence can be transformed into a sentence of the opposite sentiment, which should halve the work required to generate text. The proposed pipeline consists of a sentiment classifier with an attention mechanism that highlights the short phrases that determine the sentiment of a sentence. These phrases are then changed to phrases of the opposite sentiment using a baseline model and an autoencoder approach. Experiments are run both on the separate parts of the pipeline and on the end-to-end model. The sentiment classifier is tested on its accuracy and is found to perform adequately. The autoencoder is tested on how well it can change the sentiment of an encoded phrase, and such a task is found to be possible. We use human evaluation to judge the performance of the full (end-to-end) pipeline, which reveals that a model using word vectors outperforms the encoder model. Numerical evaluation shows that a success rate of 54.7% is achieved on the sentiment change.