Beyond Trending Topics: Real-World Event Identification on Twitter

AAAI Conferences

User-contributed messages on social media sites such as Twitter have emerged as powerful, real-time means of information sharing on the Web. These short messages tend to reflect a variety of events in real time, making Twitter particularly well suited as a source of real-time event content. In this paper, we explore approaches for analyzing the stream of Twitter messages to distinguish between messages about real-world events and non-event messages. Our approach relies on a rich family of aggregate statistics of topically similar message clusters. Large-scale experiments over millions of Twitter messages show the effectiveness of our approach for surfacing real-world event content on Twitter.


Culture Matters: A Survey Study of Social Q&A Behavior

AAAI Conferences

Online social networking tools are used around the world by people to ask questions of their friends, because friends provide direct, reliable, contextualized, and interactive responses. However, although the tools used in different cultures for question asking are often very similar, the way they are used can be very different, reflecting unique inherent cultural characteristics. We present the results of a survey designed to elicit cultural differences in people’s social question asking behaviors across the United States, the United Kingdom, China, and India. The survey received responses from 933 people distributed across the four countries who held similar job roles and were employed by a single organization. Responses included information about the questions they ask via social networking tools, and their motivations for asking and answering questions online. The results reveal culture as a consistently significant factor in predicting people’s social question and answer behavior. The prominent cultural differences we observe might be traced to people’s inherent cultural characteristics (e.g., their cognitive patterns and social orientation), and should be comprehensively considered in designing social search systems.


Participation Maximization Based on Social Influence in Online Discussion Forums

AAAI Conferences

In online discussion forums, users are more motivated to take part in discussions when observing other users’ participation, an effect of social influence among forum users. In this paper, we study how to utilize social influence for increasing the overall forum participation. To this end, we propose a mechanism to maximize user influence and boost participation by displaying forum threads to users. We formally define the participation maximization problem, and show that it is a special instance of the social welfare maximization problem with submodular utility functions and that it is NP-hard. However, generic approximation algorithms are impractical for real-world forums due to their time complexity. We therefore design a heuristic algorithm, named Thread Allocation Based on Influence (TABI), to tackle the problem. Through extensive experiments using a dataset from a real-world online forum, we demonstrate that TABI consistently outperforms all other algorithms in maximizing participation. The results of this work demonstrate that current recommender systems can be made more effective by considering future influence propagation. The problem of participation maximization based on influence also opens a new direction in the study of social influence.
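To make the phrase "social welfare maximization with submodular utility functions" concrete, the following is a minimal sketch of the generic greedy allocation for that problem class. It is not TABI and not the paper's formulation: the function names, the toy coverage utility, and the thread and user data are all hypothetical, chosen only for illustration. Each item goes to the agent with the largest marginal gain, a strategy known to achieve at least half of the optimal welfare when utilities are monotone and submodular.

```python
from typing import Callable, Dict, Hashable, List, Set

Utility = Callable[[Hashable, Set[Hashable]], float]


def greedy_allocation(
    items: List[Hashable], agents: List[Hashable], utility: Utility
) -> Dict[Hashable, Set[Hashable]]:
    """Assign each item to the agent whose utility increases the most."""
    allocation: Dict[Hashable, Set[Hashable]] = {agent: set() for agent in agents}
    for item in items:
        def gain(agent: Hashable) -> float:
            current = allocation[agent]
            return utility(agent, current | {item}) - utility(agent, current)

        # Ties are broken in favor of the agent listed first.
        best = max(agents, key=gain)
        allocation[best].add(item)
    return allocation


# Hypothetical toy data: each thread covers topics, each user has interests.
THREAD_TOPICS = {"t1": {"python"}, "t2": {"python"}, "t3": {"rust"}}
USER_INTERESTS = {"alice": {"python", "rust"}, "bob": {"python"}}


def coverage_utility(user: Hashable, threads: Set[Hashable]) -> float:
    """Number of the user's interests covered by the threads (submodular)."""
    covered: Set[str] = set()
    for thread in threads:
        covered |= THREAD_TOPICS[thread]
    return len(covered & USER_INTERESTS[user])


allocation = greedy_allocation(["t1", "t2", "t3"], ["alice", "bob"], coverage_utility)
welfare = sum(coverage_utility(user, threads) for user, threads in allocation.items())
```

In this toy run the second Python thread goes to bob, because alice's interest in Python is already covered and her marginal gain is zero. Each step evaluates the utility for every agent, which hints at why generic approaches can become costly on forums with many users and threads.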


Differential Adaptive Diffusion: Understanding Diversity and Learning whom to Trust in Viral Marketing

AAAI Conferences

Viral marketing mechanisms use the existing social network between customers to spread information about products and encourage product adoption. Existing viral marketing models focus on the dynamics of the diffusion process; however, they typically (a) consider only a single product campaign and (b) fail to model the evolution of the social network as the trust between individuals changes over time, during the course of multiple campaigns. In this work, we propose an adaptive viral marketing model which captures: (1) multiple different product campaigns, (2) the diversity in customer preferences among different product categories, and (3) changing confidence in peers’ recommendations over time. By applying our model to a real-world network extracted from the Digg social news website, we provide insights into the effects of network dynamics on the different products’ adoption. Our experiments show that our proposed model outperforms earlier nonadaptive diffusion models in predicting future product adoptions. We also show how this model can be used to explore new viral marketing strategies that are more successful than classic strategies which ignore the dynamic nature of social networks.


An Assessment of Intrinsic and Extrinsic Motivation on Task Performance in Crowdsourcing Markets

AAAI Conferences

Crowdsourced labor markets represent a powerful new paradigm for accomplishing work. Understanding the motivating factors that lead to high quality work could have significant benefits. However, researchers have so far found that motivating factors such as increased monetary reward generally increase workers’ willingness to accept a task or the speed at which a task is completed, but do not improve the quality of the work. We hypothesize that factors that increase the intrinsic motivation of a task – such as framing a task as helping others – may succeed in improving output quality where extrinsic motivators such as increased pay do not. In this paper we present an experiment testing this hypothesis along with a novel experimental design that enables controlled experimentation with intrinsic and extrinsic motivators in Amazon’s Mechanical Turk, a popular crowdsourcing task market. Results suggest that intrinsic motivation can indeed improve the quality of workers’ output, confirming our hypothesis. Furthermore, we find a synergistic interaction between intrinsic and extrinsic motivators that runs contrary to previous literature suggesting “crowding out” effects. Our results have significant practical and theoretical implications for crowd work.


Detecting and Tracking Political Abuse in Social Media

AAAI Conferences

We study astroturf political campaigns on microblogging platforms: politically-motivated individuals and organizations that use multiple centrally-controlled accounts to create the appearance of widespread support for a candidate or opinion. We describe a machine learning framework that combines topological, content-based and crowdsourced features of information diffusion networks on Twitter to detect the early stages of viral spreading of political misinformation. We present promising preliminary results with better than 96% accuracy in the detection of astroturf content in the run-up to the 2010 U.S. midterm elections.


A Machine Learning Approach to Twitter User Classification

AAAI Conferences

This paper addresses the task of user classification in social media, with an application to Twitter. We automatically infer the values of user attributes such as political orientation or ethnicity by leveraging observable information such as the user behavior, network structure and the linguistic content of the user’s Twitter feed. We employ a machine learning approach which relies on a comprehensive set of features derived from such user information. We report encouraging experimental results on 3 tasks with different characteristics: political affiliation detection, ethnicity identification and detecting affinity for a particular business. Finally, our analysis shows that rich linguistic features prove consistently valuable across the 3 tasks and show great promise for additional user classification needs.


Generate Adjective Sentiment Dictionary for Social Media Sentiment Analysis Using Constrained Nonnegative Matrix Factorization

AAAI Conferences

Although sentiment analysis has attracted a lot of research, little work has been done on social media data compared to product and movie reviews. This is due to the low accuracy that results from the more informal writing seen in social media data. Currently, most sentiment analysis tools for social media choose the lexicon-based approach over the machine learning approach, because the latter faces the huge challenge of obtaining enough human-labeled training data for extremely large-scale and diverse social opinion data. The lexicon-based approach requires a sentiment dictionary to determine opinion polarity. This dictionary can also provide useful features for any supervised learning method in the machine learning approach. However, many benchmark sentiment dictionaries do not cover the many informal and spoken words used in social media. In addition, they cannot be updated frequently to include words newly coined online. In this paper, we present an automatic sentiment dictionary generation method, the Constrained Symmetric Nonnegative Matrix Factorization (CSNMF) algorithm, which assigns polarity scores to each word in the dictionary, using a large social media corpus from digg.com. Moreover, we present our Amazon Mechanical Turk (AMT) study of social media word polarity, and compare our generated dictionary against both the human-labeled dictionaries from AMT and the General Inquirer Lexicon. In our experiment, we show that combining links from both WordNet and the corpus to generate sentiment dictionaries outperforms using only one of them, and that words with higher sentiment scores yield better precision. Finally, we conduct a lexicon-based sentiment analysis on human-labeled social comments using our generated sentiment dictionary to show the effectiveness of our method.
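As a rough illustration of what "lexicon-based" means here, the sketch below labels a comment by summing per-word polarity scores from a dictionary. It does not implement CSNMF, and it is not the authors' pipeline: the `LEXICON` entries, their scores, the `polarity` function, and the simple tokenizer are all assumptions made up for this example.

```python
import re
from typing import Dict

# Hypothetical polarity scores; a real dictionary would be generated or curated.
LEXICON: Dict[str, float] = {"awesome": 0.9, "lol": 0.3, "meh": -0.4, "sucks": -0.8}


def polarity(comment: str, lexicon: Dict[str, float] = LEXICON) -> str:
    """Label a comment by the sign of its summed word polarity scores."""
    # Lowercase and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", comment.lower())
    # Words missing from the dictionary contribute nothing to the total.
    total = sum(lexicon.get(token, 0.0) for token in tokens)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

A comment made up entirely of out-of-dictionary words comes back as "neutral", which is the coverage gap the abstract describes for informal social media vocabulary.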


Extracting Meta Statements from the Blogosphere

AAAI Conferences

Information extraction systems have been recently proposed for organizing and exploring content in large online text corpora as information networks. In such networks, the nodes are named entities (e.g., people, organizations) while the edges correspond to statements indicating relations among such entities. To date, such systems extract rather primitive networks, capturing only those relations which are expressed by direct statements. In many applications, it is useful to also extract more subtle relations which are often expressed as meta statements in the text. These can, for instance, provide the context for a statement (e.g., “Google acquired YouTube in October 2006”), or the repercussions of a statement (e.g., “The US condemned Russia’s invasion of Georgia”). In this work, we report on a system for extracting relations expressed in both direct statements and meta statements. We propose a method based on Conditional Random Fields that explores syntactic features to extract both kinds of statements seamlessly. We follow the Open Information Extraction paradigm, where a classifier is trained to recognize any type of relation instead of specific ones. Finally, our results show substantial improvements over a state-of-the-art information extraction system, both in terms of accuracy and, especially, recall.


Task Specialization in Social Production Communities: The Case of Geographic Volunteer Work

AAAI Conferences

In social production communities, users' individual and collective efforts lead to the creation of valuable resources (cf. Wikipedia, Open Street Map, and Reddit). Contributors to such communities often specialize in the tasks they choose to do. We found evidence for specialization by work type in Cyclopath, a geographic wiki for bicyclists: most users edit a single type of map feature, such as points of interest or roads and trails. We also saw a user lifecycle effect: as users gain experience, they specialize in editing roads and trails. Our findings suggest more effective ways to organize social production interfaces, compose units of work, and match them to users who want to help.