Social Mechanics: An Empirically Grounded Science of Social Media

AAAI Conferences

What will social media sites of tomorrow look like? What behaviors will their interfaces enable? A major challenge for designing new sites that allow a broader range of user actions is the difficulty of extrapolating from experience with current sites without first distinguishing correlations from underlying causal mechanisms. The growing availability of data on user activities provides new opportunities to uncover correlations among user activity, contributed content and the structure of links among users. However, such correlations do not necessarily translate into predictive models. Instead, empirically grounded mechanistic models provide a stronger basis for establishing causal mechanisms and discovering the underlying statistical laws governing social behavior. We describe a statistical physics-based framework for modeling and analyzing social media and illustrate its application to the problems of prediction and inference. We hope these examples will inspire the research community to explore these methods to look for empirically valid causal mechanisms for the observed correlations.


Modeling Public Mood and Emotion: Twitter Sentiment and Socio-Economic Phenomena

AAAI Conferences

We perform a sentiment analysis of all tweets published on the microblogging platform Twitter in the second half of 2008. We use a psychometric instrument to extract six mood states (tension, depression, anger, vigor, fatigue, confusion) from the aggregated Twitter content and compute a six-dimensional mood vector for each day in the timeline. We compare our results to a record of popular events gathered from media and sources. We find that events in the social, political, cultural and economic spheres do have a significant, immediate and highly specific effect on the various dimensions of public mood. We speculate that large-scale analyses of mood can provide a solid platform to model collective emotive trends in terms of their predictive value with regard to existing social as well as economic indicators.
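
As a rough illustration of what a per-day, six-dimensional mood vector could look like, the sketch below counts lexicon hits per mood dimension and normalizes by the number of tweets that day. The abstract does not describe the psychometric instrument's term list or scoring, so the toy `LEXICON`, the function name `daily_mood_vectors`, and the simple rate normalization are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter, defaultdict

# The six mood states named in the abstract, in a fixed order.
MOODS = ("tension", "depression", "anger", "vigor", "fatigue", "confusion")

# Hypothetical toy lexicon (term -> mood dimension); the real instrument's
# term list is not given in the abstract.
LEXICON = {
    "nervous": "tension",
    "sad": "depression",
    "furious": "anger",
    "energetic": "vigor",
    "tired": "fatigue",
    "puzzled": "confusion",
}


def daily_mood_vectors(tweets):
    """Map each day to a six-tuple of lexicon hits per tweet.

    tweets: iterable of (day, text) pairs.
    """
    hits = defaultdict(Counter)  # day -> mood -> number of matching tokens
    totals = Counter()           # day -> number of tweets
    for day, text in tweets:
        totals[day] += 1
        for token in text.lower().split():
            mood = LEXICON.get(token)
            if mood is not None:
                hits[day][mood] += 1
    # Assumed normalization: matching tokens divided by tweets on that day.
    return {
        day: tuple(hits[day][m] / totals[day] for m in MOODS)
        for day in totals
    }


if __name__ == "__main__":
    sample = [
        ("2008-11-04", "so tired and sad"),
        ("2008-11-04", "feeling energetic"),
        ("2008-11-05", "puzzled"),
    ]
    for day, vector in sorted(daily_mood_vectors(sample).items()):
        print(day, vector)
```

Normalizing by daily tweet volume keeps days with different activity levels comparable; a real analysis would also need tokenization and stemming suited to tweets.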


A Bootstrapping Approach to Identifying Relevant Tweets for Social TV

AAAI Conferences

Manufacturers of TV sets have recently started adding social media features to their products. Some of these products display microblogging messages relevant to the TV show which the user is currently watching. However, such systems suffer from low precision and recall when they use the title of the show to search for relevant messages. Titles of some popular shows such as Lost or Survivor are highly ambiguous, resulting in messages unrelated to the show. Thus, there is a need to develop filtering algorithms that can achieve both high precision and recall. Filtering microblogging messages for Social TV poses several challenges, including lack of training data, lack of proper grammar and capitalization, lack of context due to text sparsity, etc. We describe a bootstrapping algorithm which uses a small manually labeled dataset, a large dataset of unlabeled messages, and some domain knowledge to derive a high precision classifier that can successfully filter microblogging messages which discuss television shows. The classifier is designed to generalize to TV shows which were not part of the training set. The algorithm achieves high precision on our two test datasets and successfully generalizes to unseen television shows. Furthermore, it compares favorably to a text classifier specifically trained on the television shows used for testing.


An Assessment of Intrinsic and Extrinsic Motivation on Task Performance in Crowdsourcing Markets

AAAI Conferences

Crowdsourced labor markets represent a powerful new paradigm for accomplishing work. Understanding the motivating factors that lead to high quality work could have significant benefits. However, researchers have so far found that motivating factors such as increased monetary reward generally increase workers’ willingness to accept a task or the speed at which a task is completed, but do not improve the quality of the work. We hypothesize that factors that increase the intrinsic motivation of a task – such as framing a task as helping others – may succeed in improving output quality where extrinsic motivators such as increased pay do not. In this paper we present an experiment testing this hypothesis along with a novel experimental design that enables controlled experimentation with intrinsic and extrinsic motivators in Amazon’s Mechanical Turk, a popular crowdsourcing task market. Results suggest that intrinsic motivation can indeed improve the quality of workers’ output, confirming our hypothesis. Furthermore, we find a synergistic interaction between intrinsic and extrinsic motivators that runs contrary to previous literature suggesting “crowding out” effects. Our results have significant practical and theoretical implications for crowd work.


Rating Friends Without Making Enemies

AAAI Conferences

As online social networks expand their role beyond maintaining existing relationships, they may look to more faceted ratings to support the formation of new connections between their users. Our study focuses on one community employing faceted ratings, CouchSurfing.org, and combines data analysis of ratings, a large-scale survey, and in-depth interviews. In order to understand the ratings, we revisit the notions of friendship and trust and uncover an asymmetry: close friendship includes trust, but high levels of trust can be achieved without close friendship. To users, providing faceted ratings presents challenges, including differentiating and quantifying inherently subjective feelings such as friendship and trust, concern over a friend's reaction to a rating, and knowledge of how ratings can affect others' reputations. One consequence of these issues is the near absence of negative feedback, even though a small portion of actual experiences and privately held ratings are negative. We show how users take this into account when formulating and interpreting ratings, and discuss designs that could encourage more balanced feedback.


More Voices Than Ever? Quantifying Media Bias in Networks

AAAI Conferences

Social media, such as blogs, are often seen as democratic entities that allow more voices to be heard than the conventional mass or elite media. Some also feel that social media exhibits a balancing force against the arguably slanted elite media. A systematic comparison between social and mainstream media is necessary but challenging due to the scale and dynamic nature of modern communication. Here we propose empirical measures to quantify the extent and dynamics of social (blog) and mainstream (news) media bias. We focus on a particular form of bias--coverage quantity--as applied to stories about the 111th US Congress. We compare observed coverage of Members of Congress against a null model of unbiased coverage, testing for biases with respect to political party, popular front runners, regions of the country, and more. Our measures suggest distinct characteristics in news and blog media. A simple generative model, in agreement with data, reveals differences in the process of coverage selection between the two media.
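
To make the idea of testing observed coverage against a null model of unbiased coverage concrete, here is a minimal sketch. It assumes the simplest possible null, in which every mention is equally likely to go to any member, so a group's expected share of mentions equals its share of seats. The abstract does not specify the null model used in the paper; the binomial form, the function name `coverage_bias_z`, and its parameters are illustrative assumptions.

```python
import math


def coverage_bias_z(observed_mentions, total_mentions, group_size, population_size):
    """Z-score of a group's observed mentions under an assumed binomial null.

    Assumed null: each of total_mentions independently lands on a group member
    with probability group_size / population_size.
    """
    p = group_size / population_size
    expected = total_mentions * p
    std = math.sqrt(total_mentions * p * (1 - p))
    return (observed_mentions - expected) / std


if __name__ == "__main__":
    # Hypothetical numbers: a group holding half the seats receives
    # 60 of 100 mentions.
    print(coverage_bias_z(60, 100, 50, 100))
```

A positive score indicates more coverage than the assumed null predicts and a negative score indicates less; the same comparison could be repeated for party, region, or any other grouping.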


Find Me the Right Content! Diversity-Based Sampling of Social Media Spaces for Topic-Centric Search

AAAI Conferences

Social media and networking websites, such as Twitter and Facebook, generate large quantities of information and have become mechanisms for real-time content dissipation to users. An important question that arises is: how do we sample such social media information spaces in order to deliver relevant content on a topic to end users? Notice that these large-scale information spaces are inherently diverse, featuring a wide array of attributes such as location, recency, degree of diffusion effects in the network and so on. Naturally, for the end user, different levels of diversity in social media content can significantly impact the information consumption experience: low diversity can provide focused content that may be simpler to understand, while high diversity can increase breadth in the exposure to multiple opinions and perspectives. Hence to address our research question, we turn to diversity as a core concept in our proposed sampling methodology. Here we are motivated by ideas in the "compressive sensing" literature and utilize the notion of sparsity in social media information to represent such large spaces via a small number of basis components. Thereafter we use a greedy iterative clustering technique on this transformed space to construct samples matching a desired level of diversity. Based on Twitter Firehose data, we demonstrate quantitatively that our method is robust, and performs better than other baseline techniques over a variety of trending topics. In a user study, we further show that users find samples generated by our method to be more interesting and subjectively engaging compared to techniques inspired by state-of-the-art systems, with improvements in the range of 15--45%.


Making Project Team Recommendations from Online Information Sources

AAAI Conferences

We are developing an Internet platform called MediaTeam that provides a marketplace connecting media content consumers to communities of media content creators. The platform is enabled by our method for automated assembly of virtual project teams. Media creators use the automated team assembler to quickly identify and team with collaborators. The team assembly platform factors in how the skills, work, and communication styles of team members complement each other into its team recommendation process. We are now testing the teaming and collaboration platforms with video creators and seek to launch by the summer.


Areca: Online Comparison of Research Results

AAAI Conferences

To experiment properly, scientists from many research areas need large sets of real world data. Information retrieval scientists for example often need to evaluate their algorithms on a dataset or a gold standard. The availability of these datasets often is insufficient and authors with the same goal do not evaluate their approaches on the same data. To make research results more transparent and comparable, we introduce Areca, an online portal for sharing datasets and/or the results that were reached with the author’s algorithms on these datasets. Having such an online comparison makes it easier to grasp the state-of-the-art on certain tasks and drive research to improve the results.


Supervised Topic Segmentation of Email Conversations

AAAI Conferences

We propose a graph-theoretic supervised topic segmentation model for email conversations which combines (i) lexical knowledge, (ii) conversational features, and (iii) topic features. We compare our results with the existing unsupervised models (i.e., LCSeg and LDA), and with their two extensions for email conversations (i.e., LCSeg+FQG and LDA+FQG) that not only use lexical information but also exploit finer conversation structure. Empirical evaluation shows that our supervised model is the best performer and achieves the highest accuracy by combining the three knowledge sources, with knowledge about the conversation proving to be the most important indicator for segmenting emails.