Memes Online: Extracted, Subtracted, Injected, and Recollected

AAAI Conferences

Social media is playing an increasingly vital role in information dissemination. But with dissemination being more distributed, content often makes multiple hops, and consequently has the opportunity to change. In this paper we focus on content that should be changing the least, namely quoted text. We find changes to be frequent, with their likelihood depending on the authority of the copied source and the type of site that is copying. We uncover patterns in the rate of appearance of new variants, their length, and popularity, and develop a simple model that is able to capture them. These patterns are distinct from ones produced when all copies are made from the same source, suggesting that information is evolving as it is being processed collectively in online social media.


Differential Adaptive Diffusion: Understanding Diversity and Learning whom to Trust in Viral Marketing

AAAI Conferences

Viral marketing mechanisms use the existing social network between customers to spread information about products and encourage product adoption. Existing viral marketing models focus on the dynamics of the diffusion process; however, they typically (a) only consider a single product campaign and (b) fail to model the evolution of the social network, as the trust between individuals changes over time, during the course of multiple campaigns. In this work, we propose an adaptive viral marketing model which captures: (1) multiple different product campaigns, (2) the diversity in customer preferences among different product categories, and (3) changing confidence in peers’ recommendations over time. By applying our model to a real-world network extracted from the Digg social news website, we provide insights into the effects of network dynamics on the different products’ adoption. Our experiments show that our proposed model outperforms earlier nonadaptive diffusion models in predicting future product adoptions. We also show how this model can be used to explore new viral marketing strategies that are more successful than classic strategies which ignore the dynamic nature of social networks.


An Assessment of Intrinsic and Extrinsic Motivation on Task Performance in Crowdsourcing Markets

AAAI Conferences

Crowdsourced labor markets represent a powerful new paradigm for accomplishing work. Understanding the motivating factors that lead to high quality work could have significant benefits. However, researchers have so far found that motivating factors such as increased monetary reward generally increase workers’ willingness to accept a task or the speed at which a task is completed, but do not improve the quality of the work. We hypothesize that factors that increase the intrinsic motivation of a task – such as framing a task as helping others – may succeed in improving output quality where extrinsic motivators such as increased pay do not. In this paper we present an experiment testing this hypothesis along with a novel experimental design that enables controlled experimentation with intrinsic and extrinsic motivators in Amazon’s Mechanical Turk, a popular crowdsourcing task market. Results suggest that intrinsic motivation can indeed improve the quality of workers’ output, confirming our hypothesis. Furthermore, we find a synergistic interaction between intrinsic and extrinsic motivators that runs contrary to previous literature suggesting “crowding out” effects. Our results have significant practical and theoretical implications for crowd work.


Scalable Event-Based Clustering of Social Media Via Record Linkage Techniques

AAAI Conferences

We tackle the problem of grouping content available in social media applications such as Flickr, YouTube, Panoramio, etc. into clusters of documents describing the same event. This task has previously been referred to as event identification. We present a new formalization of the event identification task as a record linkage problem and show that this formulation leads to a principled and highly efficient solution to the problem. We present results on two datasets derived from Flickr (last.fm and upcoming), comparing the results in terms of Normalized Mutual Information and F-Measure with respect to several baselines, showing that a record linkage approach outperforms all baselines as well as a state-of-the-art system. We demonstrate that our approach can scale to large amounts of data, reducing the processing time considerably compared to a state-of-the-art approach. The scalability is achieved by applying an appropriate blocking strategy and relying on a Single Linkage clustering algorithm which avoids the exhaustive computation of pairwise similarities.
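The combination of blocking and single-linkage clustering mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the document fields (`id`, `day`, `tags`), the choice of upload day as the block key, Jaccard similarity over tags, and the 0.5 threshold are all assumptions made for illustration. Documents are only compared inside their block, and any pair above the threshold is merged with a union-find structure, which yields single-linkage clusters.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of tags."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_events(docs, threshold=0.5):
    """Group documents into events via blocking plus single linkage.

    docs: list of dicts with hypothetical keys 'id', 'day', 'tags'.
    Only documents sharing a block key (here, the upload day) are compared.
    """
    parent = {d["id"]: d["id"] for d in docs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Blocking: bucket documents by an inexpensive key.
    blocks = defaultdict(list)
    for d in docs:
        blocks[d["day"]].append(d)

    # Single linkage: one sufficiently similar pair merges two clusters.
    for block in blocks.values():
        for a, b in combinations(block, 2):
            if jaccard(a["tags"], b["tags"]) >= threshold:
                parent[find(a["id"])] = find(b["id"])

    clusters = defaultdict(set)
    for d in docs:
        clusters[find(d["id"])].add(d["id"])
    return sorted(sorted(c) for c in clusters.values())
```

In this sketch the saving comes from blocking alone: pairs in different blocks are never scored, so the number of comparisons depends on block sizes rather than on the size of the whole collection.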


Detecting and Tracking Political Abuse in Social Media

AAAI Conferences

We study astroturf political campaigns on microblogging platforms: politically-motivated individuals and organizations that use multiple centrally-controlled accounts to create the appearance of widespread support for a candidate or opinion. We describe a machine learning framework that combines topological, content-based and crowdsourced features of information diffusion networks on Twitter to detect the early stages of viral spreading of political misinformation. We present promising preliminary results with better than 96% accuracy in the detection of astroturf content in the run-up to the 2010 U.S. midterm elections.


The Effect of Mobile Platforms on Twitter Content Generation

AAAI Conferences

The increased popularity of feature-rich mobile devices in recent years has enabled widespread consumption and production of social media content via mobile devices. Because mobile devices and mobile applications change the context within which an individual generates and consumes microblog content, we might expect microblogging behavior to differ depending on whether the user is using a mobile device. To our knowledge, little has been established about what effects, if any, such mobile interfaces have on microblogging. In this paper, we investigate this question within the context of Twitter, among the most popular microblogging platforms. This work makes three specific contributions. First, we quantify the ways in which user profiles are affected by the mobile context: (1) the extent to which users tend to be either fully non-mobile or mobile and (2) the relative activity of the mobile Twitter community. Second, we assess the differences in content between mobile and non-mobile tweets (posts to the Twitter platform). Our results show that mobile platforms produce very different patterns of Twitter usage. Third, as part of our analysis, we propose and apply a classification system for tweets. While other classification systems have been proposed, ours is the first to permit the independent encoding of a tweet’s form, content, and intended audience. In this paper we apply this system to show how tweets differ between mobile and non-mobile contexts. However, because of its flexibility and breadth, the schema may be useful to researchers studying Twitter content in other contexts as well.


A Machine Learning Approach to Twitter User Classification

AAAI Conferences

This paper addresses the task of user classification in social media, with an application to Twitter. We automatically infer the values of user attributes such as political orientation or ethnicity by leveraging observable information such as the user behavior, network structure and the linguistic content of the user’s Twitter feed. We employ a machine learning approach which relies on a comprehensive set of features derived from such user information. We report encouraging experimental results on 3 tasks with different characteristics: political affiliation detection, ethnicity identification and detecting affinity for a particular business. Finally, our analysis shows that rich linguistic features prove consistently valuable across the 3 tasks and show great promise for additional user classification needs.


Generate Adjective Sentiment Dictionary for Social Media Sentiment Analysis Using Constrained Nonnegative Matrix Factorization

AAAI Conferences

Although sentiment analysis has attracted a lot of research, little work has been done on social media data compared to product and movie reviews. This is due to the low accuracy that results from the more informal writing seen in social media data. Currently, most sentiment analysis tools for social media choose the lexicon-based approach instead of the machine learning approach, because the latter faces the huge challenge of obtaining enough human-labeled training data for extremely large-scale and diverse social opinion data. The lexicon-based approach requires a sentiment dictionary to determine opinion polarity. This dictionary can also provide useful features for any supervised learning method of the machine learning approach. However, many benchmark sentiment dictionaries do not cover the many informal and spoken words used in social media. In addition, they cannot be updated frequently to include newly generated words online. In this paper, we present an automatic sentiment dictionary generation method, called the Constrained Symmetric Nonnegative Matrix Factorization (CSNMF) algorithm, to assign polarity scores to each word in the dictionary, on a large social media corpus, digg.com. Moreover, we present our Amazon Mechanical Turk (AMT) study of social media word polarity, and compare our generated dictionary against both the human-labeled dictionaries from AMT and the General Inquirer Lexicon. In our experiment, we show that combining links from both WordNet and the corpus to generate sentiment dictionaries outperforms using only one of them, and that words with higher sentiment scores yield better precision. Finally, we conducted a lexicon-based sentiment analysis on human-labeled social comments using our generated sentiment dictionary to show the effectiveness of our method.
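To make the lexicon-based approach described in this abstract concrete, here is a minimal sketch of how a sentiment dictionary with per-word polarity scores could be applied to a comment. It does not reproduce CSNMF or the authors' evaluation: the function names, the tokenizer, the sum-of-scores rule and the toy dictionary entries are all assumptions made for illustration.

```python
import re


def score_comment(text, lexicon):
    """Sum the polarity scores of all dictionary words found in the text.

    lexicon: dict mapping a lowercase word to a polarity score
    (positive values mean positive sentiment); a hypothetical format.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)


def classify(text, lexicon):
    """Label a comment by the sign of its summed polarity score."""
    score = score_comment(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Toy dictionary with made-up scores for informal words.
toy_lexicon = {"awesome": 0.9, "lame": -0.7, "meh": -0.2}
print(classify("This is awesome", toy_lexicon))
```

Words missing from the dictionary contribute nothing to the score, which is why coverage of informal vocabulary matters so much for this family of methods.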


What's in a @name? How Name Value Biases Judgment of Microblog Authors

AAAI Conferences

Bias can be defined as selective favoritism exhibited by human beings when faced with a decision-making task across multiple options. Online communities present plenty of decision-making opportunities to their users. Users exhibit biases in their attachments, voting, ratings and other decision-making tasks. We study bias amongst microblog users due to the value of an author's name. We describe the relationship between name value bias and number of followers, and cluster authors and readers based on patterns of bias they receive and exhibit, respectively. For authors we show that content from known names (e.g., @CNN) is rated artificially high, while content from unknown names is rated artificially low. For readers, our results indicate that there are two types: slightly biased and heavily biased. A subsequent analysis of Twitter author names revealed attributes of names that underlie this bias, including effects for gender, type of name (individual versus organization), and degree of topical relevance. We discuss how our work can be instructive to content distributors and search engines in leveraging and presenting microblog content.


The Prevalence of Political Discourse in Non-Political Blogs

AAAI Conferences

Though political theorists have emphasized the importance of political discussion in non-political spaces, past study of online political discussion has focused on primarily political websites. Using a random sample from Blogger.com, we find that 25% of all political posts are from blogs that post about politics less than 20% of the time, because the vast majority of blogs post about politics some of the time but infrequently. Far from being taboo topics in those non-political blogs, political posts got slightly more comments than non-political posts in those same blogs, and the comments overwhelmingly engage the political topics of the post, mostly agreeing but frequently disagreeing as well. We argue that non-political spaces devoted primarily to personal diaries, hobbies, and other topics represent a substantial place of online political discussion and should be a site for further study.