A Data-Driven Study of View Duration on YouTube

AAAI Conferences

Video watching has emerged as one of the most frequent media activities on the Internet. Yet, little is known about how users watch online video. Using two distinct YouTube datasets, a set of random YouTube videos crawled from the Web and a set of videos watched by participants tracked by a Chrome extension, we examine whether and how indicators of collective preferences and reactions are associated with view duration of videos. We show that video view duration is positively associated with the video's view count, the number of likes per view, and the negative sentiment in the comments. These metrics and reactions have significant predictive power over how long individuals watch a video. Our findings provide a more precise understanding of user engagement with video content in social media beyond view count.

VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text

AAAI Conferences

The inherent nature of social media content poses serious challenges to practical applications of sentiment analysis. We present VADER, a simple rule-based model for general sentiment analysis, and compare its effectiveness to eleven typical state-of-practice benchmarks including LIWC, ANEW, the General Inquirer, SentiWordNet, and machine learning oriented techniques relying on Naive Bayes, Maximum Entropy, and Support Vector Machine (SVM) algorithms. Using a combination of qualitative and quantitative methods, we first construct and empirically validate a gold-standard list of lexical features (along with their associated sentiment intensity measures) which are specifically attuned to sentiment in microblog-like contexts. We then combine these lexical features with consideration for five general rules that embody grammatical and syntactical conventions for expressing and emphasizing sentiment intensity. Interestingly, using our parsimonious rule-based model to assess the sentiment of tweets, we find that VADER outperforms individual human raters (F1 Classification Accuracy = 0.96 and 0.84, respectively), and generalizes more favorably across contexts than any of our benchmarks.
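
As a rough illustration of the lexicon-plus-rules idea described in this abstract, the following is a minimal sketch, not the authors' implementation. The tiny lexicon, the rule constants, and the function name `score` are all hypothetical, and the four rules shown (capitalization, degree modifiers, negation, exclamation marks) are only examples of the kind of convention such a model might encode; the abstract does not enumerate the paper's five rules.

```python
import math
import re

# Hypothetical toy lexicon (word -> valence); values are illustrative only,
# not taken from the validated VADER lexicon.
LEXICON = {"good": 1.9, "great": 3.1, "bad": -2.5, "terrible": -3.0}
# Hypothetical degree modifiers: positive values intensify, negative dampen.
BOOSTERS = {"very": 0.3, "extremely": 0.3, "slightly": -0.3}
NEGATIONS = {"not", "never", "no"}


def score(text: str) -> float:
    """Return a sentiment score in (-1, 1) using a lexicon and simple rules."""
    tokens = re.findall(r"[A-Za-z']+", text)
    total = 0.0
    for i, tok in enumerate(tokens):
        word = tok.lower()
        if word not in LEXICON:
            continue
        valence = LEXICON[word]
        sign = 1.0 if valence > 0 else -1.0
        # Rule (assumed): an ALL-CAPS sentiment word is more intense.
        if tok.isupper() and len(tok) > 1:
            valence += sign * 0.7
        # Rule (assumed): a degree modifier directly before the word
        # shifts its intensity.
        if i > 0 and tokens[i - 1].lower() in BOOSTERS:
            valence += sign * BOOSTERS[tokens[i - 1].lower()]
        # Rule (assumed): a negation within the two preceding tokens
        # flips and dampens the valence.
        if any(t.lower() in NEGATIONS for t in tokens[max(0, i - 2):i]):
            valence *= -0.75
        total += valence
    # Rule (assumed): up to three exclamation marks amplify the overall score.
    if total != 0:
        total += math.copysign(min(text.count("!"), 3) * 0.3, total)
    # Squash the unbounded sum into (-1, 1); the constant 15 is an assumption.
    return total / math.sqrt(total * total + 15.0)
```

Under these assumptions, `score("not good")` is negative while `score("very good")` exceeds `score("good")`, and text with no lexicon words scores exactly zero.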

Which company does the best job at image recognition? Microsoft, Amazon, Google, or IBM?

ZDNet

Sometimes recognition software is excellent at correctly categorizing certain types of images but totally fails with others. Some image recognition engines prefer cats over dogs, and some are far more descriptive with their color knowledge. But which is the best overall? Perficient Digital's image recognition accuracy study looked at image recognition -- one of the hottest areas of machine learning. It looked at Amazon AWS Rekognition, Google Vision, IBM Watson, and Microsoft Azure Computer Vision to compare images.

Loneliness in a Connected World: Analyzing Online Activity and Expressions on Real Life Relationships of Lonely Users

AAAI Conferences

Although loneliness is a very familiar emotion, little is known about it. An aspect to explore is the prevalence of loneliness in the connected world that social media sites like Twitter provide. In light of this, this study investigates the Twitter data of users who have expressed loneliness to understand the phenomenon. Since our primary material is tweets, we developed various indices that can measure social activities reflected in online relationships and real-life relationships solely through online Twitter data. Through these indices, the relations between social activity and loneliness were investigated. The results show that highly lonely users seem to have low online activity, high positive expressions on real-life relationships, and narrow ingroups.

Emoticons and Phrases: Status Symbols in Social Media

AAAI Conferences

There is a sociolinguistic interest in studying the social power dynamics that arise on online social networks and how these are reflected in their users’ use of language. Online social power prediction can also be used to build tools for marketing and political campaigns that help them build an audience. Existing work has focused on finding correlations between status and linguistic features in email, Wikipedia discussions, and court hearings. While a few studies have tried predicting status on the basis of language on Twitter, they have proved less fruitful. We derive a rich set of features from literature in a variety of disciplines and build classifiers that assign Twitter users to different levels of status based on their language use. Using various metrics such as number of followers and Klout score, we achieve a classification accuracy of individual users as high as 82.4%. In a second step, we reached up to 71.6% accuracy on the task of predicting the more powerful user in a dyadic conversation. We find that the manner in which powerful users write differs from low status users in a number of different ways: not only in the extent to which they deviate from their usual writing habits when conversing with others but also in pronoun use, language complexity, sentiment expression, and emoticon use. By extending our analysis to Facebook, we also assess the generalisability of our results and discuss differences and similarities between these two sites.