Analyzing the Political Sentiment of Tweets in Farsi

AAAI Conferences

We examine whether we can automatically classify the sentiment of individual tweets in Farsi, in order to track how sentiment toward a number of trending political topics changes over time. Examining tweets in Farsi adds challenges such as the lack of a sentiment lexicon and part-of-speech taggers, frequent use of colloquial words, and unique orthographic and morphological characteristics. We have collected over 1 million tweets on political topics in the Farsi language, with an annotated data set of over 3,000 tweets. We find that an SVM classifier with Brown clustering for feature selection yields a median accuracy of 56% and accuracy as high as 70%. We use this classifier to track dynamic sentiment during a key period of Iran's negotiations over its nuclear program.

A Data-Driven Study of View Duration on YouTube

AAAI Conferences

Video watching has emerged as one of the most frequent media activities on the Internet. Yet, little is known about how users watch online video. Using two distinct YouTube datasets, a set of random YouTube videos crawled from the Web and a set of videos watched by participants tracked by a Chrome extension, we examine whether and how indicators of collective preferences and reactions are associated with the view duration of videos. We show that video view duration is positively associated with the video's view count, the number of likes per view, and the negative sentiment in the comments. These metrics and reactions have significant predictive power over how long individuals watch a video. Our findings provide a more precise understanding of user engagement with video content in social media beyond view count.

Loneliness in a Connected World: Analyzing Online Activity and Expressions on Real Life Relationships of Lonely Users

AAAI Conferences

Although loneliness is a very familiar emotion, little is known about it. One aspect to explore is the prevalence of loneliness in the connected world that social media sites like Twitter provide. To that end, this study investigates the Twitter data of users who have expressed loneliness in order to understand the phenomenon. Since our primary material is tweets, we developed various indices that can measure social activities reflected in online relationships and real-life relationships solely through online Twitter data. Through these indices, the relations between social activity and loneliness were investigated. The results show that highly lonely users seem to have low online activity, high positive expressions on real-life relationships, and narrow ingroups.

Lehigh research team to investigate a 'Google for research data'


IMAGE: Brian Davison, Associate Professor of Computer Science Engineering at Lehigh University, is principal investigator of an NSF-backed project to develop a search engine intended to help scientists and others locate...

There was a time--not that long ago--when the phrases "Google it" or "check Yahoo" would have been interpreted as sneezes, or perhaps symptoms of an oncoming seizure, rather than as coherent thoughts. Today, these are key to answering all of life's questions. It's one thing to use the Web to keep up with a Kardashian, shop for ironic T-shirts, argue with our in-laws about politics, or any of the other myriad ways we use the Web in today's world. But if you are a serious researcher looking for real data that can help you advance your ideas, how useful are the underlying technologies that support the search engines we've all come to take for granted? "Not very," says Brian Davison, associate professor of computer science at Lehigh University.

Emoticons and Phrases: Status Symbols in Social Media

AAAI Conferences

There is a sociolinguistic interest in studying the social power dynamics that arise on online social networks and how these are reflected in their users’ use of language. Online social power prediction can also be used to build tools for marketing and political campaigns that help them build an audience. Existing work has focused on finding correlations between status and linguistic features in email, Wikipedia discussions, and court hearings. While a few studies have tried predicting status on the basis of language on Twitter, they have proved less fruitful. We derive a rich set of features from literature in a variety of disciplines and build classifiers that assign Twitter users to different levels of status based on their language use. Using various metrics such as number of followers and Klout score, we achieve a classification accuracy of individual users as high as 82.4%. In a second step, we reach up to 71.6% accuracy on the task of predicting the more powerful user in a dyadic conversation. We find that the manner in which powerful users write differs from that of low-status users in a number of different ways: not only in the extent to which they deviate from their usual writing habits when conversing with others but also in pronoun use, language complexity, sentiment expression, and emoticon use. By extending our analysis to Facebook, we also assess the generalisability of our results and discuss differences and similarities between these two sites.