Goto

Collaborating Authors

 hedonometer


A Comparison of Lexicon-Based and ML-Based Sentiment Analysis: Are There Outlier Words?

arXiv.org Artificial Intelligence

Lexicon-based approaches to sentiment analysis of text are based on each word or lexical entry having a pre-defined weight indicating its sentiment polarity. These are usually manually assigned but the accuracy of these when compared against machine leaning based approaches to computing sentiment, are not known. It may be that there are lexical entries whose sentiment values cause a lexicon-based approach to give results which are very different to a machine learning approach. In this paper we compute sentiment for more than 150,000 English language texts drawn from 4 domains using the Hedonometer, a lexicon-based technique and Azure, a contemporary machine-learning based approach which is part of the Azure Cognitive Services family of APIs which is easy to use. We model differences in sentiment scores between approaches for documents in each domain using a regression and analyse the independent variables (Hedonometer lexical entries) as indicators of each word's importance and contribution to the score differences. Our findings are that the importance of a word depends on the domain and there are no standout lexical entries which systematically cause differences in sentiment scores.


What is sentiment analysis? Using NLP and ML to extract meaning

#artificialintelligence

Sentiment analysis is analytical technique that uses statistics, natural language processing, and machine learning to determine the emotional meaning of communications. Companies use sentiment analysis to evaluate customer messages, call center interactions, online reviews, social media posts, and other content. Sentiment analysis can track changes in attitudes towards companies, products, or services, or individual features of those products or services. Get the latest insights with our CIO Daily newsletter. One of the most prominent examples of sentiment analysis on the Web today is the Hedonometer, a project of the University of Vermont's Computational Story Lab.