Many consumer brands have customer relationship agents that directly engage opinionated consumers on social streams, such as Twitter. To help agents find opinionated consumers, social stream monitoring tools provide keyword-based filters, which are often too coarse-grained to be effective. In this work, we introduce CrowdE, a Twitter-based filtering system that helps agents find opinionated customers through brand-specific intelligent filters. To minimize per-brand effort in creating these brand-specific filters, the system uses a common crowd-enabled process that creates the filters through machine learning over crowd-labeled tweets. We validated the quality of the crowd labels and the performance of the filter algorithms built from the labels. A user evaluation further showed that CrowdE's intelligent filters improved task performance and were generally preferred by users in comparison to keyword-based filters in current social stream monitoring tools.
Natural Language Processing (NLP) is a subfield of artificial intelligence concerned with processing and analyzing natural language data, usually in the form of text or audio. Common challenges within NLP include speech recognition, text generation, and sentiment analysis, and high-profile products deploying NLP models include Apple's Siri, Amazon's Alexa, and many of the chatbots one might interact with online. To get started with NLP and introduce some of the core concepts in the field, we're going to build a model that tries to predict the sentiment (positive, neutral, or negative) of tweets relating to US airlines, using the popular Twitter US Airline Sentiment dataset. Code snippets are included in this post; the fully reproducible notebooks and scripts are available on the project's Comet project page. Let's start by importing some libraries.
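As a rough illustration of the task (not the model built in this post), the sketch below trains a tiny multinomial Naive Bayes classifier using only Python's standard library. The training tweets are hand-written toy examples, not rows from the Twitter US Airline Sentiment dataset, and the helper names `tokenize`, `train`, and `predict` are assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict

# Toy, hand-written examples (hypothetical; NOT rows from the real dataset).
TRAIN = [
    ("thanks for the great flight and friendly crew", "positive"),
    ("love the new seats great service", "positive"),
    ("flight delayed again terrible service", "negative"),
    ("lost my bag worst airline ever", "negative"),
    ("what time does the flight to boston leave", "neutral"),
    ("is there wifi on the flight to denver", "neutral"),
]

def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def train(examples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab

def predict(text, model):
    """Return the class with the highest log-probability (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(predict("terrible flight, worst service", model))
print(predict("great crew, love it", model))
```

With six training tweets this is only a toy; the point is the shape of the pipeline (tokenize, count, score each class), which stays the same when the data and model are replaced with real ones.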
Due to the vast amount of user-generated content in the emerging Web 2.0, there is a growing need for computational sentiment analysis of documents. Most of the current research in this field is devoted to product reviews from websites. Microblogs and social networks pose an even greater challenge to sentiment classification. Nevertheless, marketing and political campaigns in particular benefit from opinions expressed on Twitter and other social communication platforms. The objects of interest in this paper are the presidential candidates of the Republican Party in the USA and their campaign topics. We introduce the combination of the noun phrases’ frequency and their PMI measure as a constraint on aspect extraction. This compensates for the tendency of PMI to score sparse phrases higher than phrases composed of high-frequency words. Evaluation shows that the meronymy relationship between politicians and their topics holds and improves the accuracy of aspect extraction.
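To make the frequency-plus-PMI constraint concrete, here is a minimal sketch. The abstract does not say how the two measures are combined, so the rule used here (keep a candidate only if it meets both a minimum frequency and a minimum PMI) is an assumption, as are the thresholds, the function names, and the toy two-word candidates.

```python
import math
from collections import Counter

def pmi_table(phrases):
    """Map each (w1, w2) candidate to (count, PMI in bits).

    PMI = log2( p(w1, w2) / (p(w1) * p(w2)) ), estimated from the
    candidate list itself; written with raw counts to keep it exact.
    """
    n = len(phrases)
    pair = Counter(phrases)
    left = Counter(w1 for w1, _ in phrases)
    right = Counter(w2 for _, w2 in phrases)
    return {
        (w1, w2): (c, math.log2(c * n / (left[w1] * right[w2])))
        for (w1, w2), c in pair.items()
    }

def select_aspects(phrases, min_freq=2, min_pmi=0.0):
    """Keep candidates that pass BOTH thresholds (hypothetical rule)."""
    table = pmi_table(phrases)
    kept = [
        (phrase, c, pmi)
        for phrase, (c, pmi) in table.items()
        if c >= min_freq and pmi >= min_pmi
    ]
    # Highest PMI first; break ties by higher frequency.
    kept.sort(key=lambda item: (-item[2], -item[1]))
    return [phrase for phrase, _, _ in kept]

# Toy noun-phrase candidates (invented for illustration).
candidates = (
    [("health", "care")] * 4
    + [("tax", "reform")] * 3
    + [("tax", "cuts")] * 2
    + [("ethanol", "subsidy")] * 1
)

table = pmi_table(candidates)
top_by_pmi = max(table, key=lambda p: table[p][1])
print(top_by_pmi)                  # the one-off phrase has the highest PMI
print(select_aspects(candidates))  # the frequency constraint removes it
```

In this toy data the phrase that occurs once gets the highest PMI (log2 of 10, about 3.3 bits) purely because its words are rare, which is the bias the frequency constraint is meant to offset.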
We describe a state-of-the-art sentiment analysis system that detects (a) the sentiment of short informal textual messages such as tweets and SMS (message-level task) and (b) the sentiment of a word or a phrase within a message (term-level task). The system is based on a supervised statistical text classification approach leveraging a variety of surface-form, semantic, and sentiment features. The sentiment features are primarily derived from novel high-coverage tweet-specific sentiment lexicons. These lexicons are automatically generated from tweets with sentiment-word hashtags and from tweets with emoticons. To adequately capture the sentiment of words in negated contexts, a separate sentiment lexicon is generated for negated words. The system ranked first in the SemEval-2013 shared task "Sentiment Analysis in Twitter" (Task 2), obtaining an F-score of 69.02 in the message-level task and 88.93 in the term-level task. Post-competition improvements boost the performance to an F-score of 70.45 (message-level task) and 89.50 (term-level task). The system also obtains state-of-the-art performance on two additional datasets: the SemEval-2013 SMS test set and a corpus of movie review excerpts. The ablation experiments demonstrate that the use of the automatically generated lexicons results in performance gains of up to 6.5 absolute percentage points.
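The lexicon-generation idea can be sketched as follows. The abstract does not give the scoring formula, so this example assumes a common choice: a word's score is PMI(word, positive) minus PMI(word, negative), which simplifies to a log ratio of counts. The seed hashtags, the negator list, the `_NEG` suffix convention, and the toy tweets are all hypothetical.

```python
import math
import re
from collections import Counter

POS_TAGS = {"#good", "#happy"}   # hypothetical positive seed hashtags
NEG_TAGS = {"#bad", "#awful"}    # hypothetical negative seed hashtags
NEGATORS = {"not", "no", "never", "don't", "can't"}
PUNCT = set(".,!?;")

def tokenize(tweet):
    """Words (with apostrophes), hashtags, and clause-ending punctuation."""
    return re.findall(r"#?[a-z']+|[.,!?;]", tweet.lower())

def mark_negation(tokens):
    """Suffix words between a negator and the next punctuation with _NEG."""
    out, negated = [], False
    for tok in tokens:
        if tok in PUNCT:
            negated = False          # negation scope ends at punctuation
            continue
        if tok.startswith("#"):
            out.append(tok)          # hashtags are kept unmarked
            continue
        out.append(tok + "_NEG" if negated else tok)
        if tok in NEGATORS:
            negated = True
    return out

def build_lexicon(tweets, smoothing=1.0):
    """Score each word by PMI(w, pos) - PMI(w, neg), with add-one smoothing."""
    pos_counts, neg_counts = Counter(), Counter()
    for tweet in tweets:
        tokens = mark_negation(tokenize(tweet))
        tags = {t for t in tokens if t.startswith("#")}
        words = [t for t in tokens if not t.startswith("#")]
        if tags & POS_TAGS and not tags & NEG_TAGS:
            pos_counts.update(words)
        elif tags & NEG_TAGS and not tags & POS_TAGS:
            neg_counts.update(words)
    n_pos, n_neg = sum(pos_counts.values()), sum(neg_counts.values())
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative tweet")
    return {
        w: math.log2(((pos_counts[w] + smoothing) * n_neg)
                     / ((neg_counts[w] + smoothing) * n_pos))
        for w in set(pos_counts) | set(neg_counts)
    }

# Toy tweets (invented for illustration).
tweets = [
    "The crew was great #good",
    "Not great, just late #bad",
    "great seats and great food #happy",
    "late again. never great #awful",
]
lexicon = build_lexicon(tweets)
for word in ("great", "great_NEG", "late"):
    print(word, round(lexicon[word], 2))
```

Here negated forms share one dictionary with their plain forms and are distinguished by the suffix; splitting the entries ending in `_NEG` into their own dictionary would give the separate negated-context lexicon the abstract mentions.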
Sentiment analysis is a powerful example of how machine learning can help developers build better products with unique features. In short, sentiment analysis is the automated process of determining whether text written in a natural language (English, Spanish, etc.) is positive, neutral, or negative about a given subject. People now express opinions and sentiment in many places: tweets, comments, reviews, articles, chats, emails and more. One popular example is Twitter, where millions of users constantly express opinions in real time. Companies use sentiment analysis on Twitter to discover insights about their products and services.