Find Me the Right Content! Diversity-Based Sampling of Social Media Spaces for Topic-Centric Search
Choudhury, Munmun De (Rutgers, The State University of New Jersey) | Counts, Scott (Microsoft Research) | Czerwinski, Mary (Microsoft Research)
Social media and networking websites, such as Twitter and Facebook, generate large quantities of information and have become mechanisms for real-time content dissemination to users. An important question that arises is: how do we sample such social media information spaces in order to deliver relevant content on a topic to end users? Notice that these large-scale information spaces are inherently diverse, featuring a wide array of attributes such as location, recency, degree of diffusion effects in the network and so on. Naturally, for the end user, different levels of diversity in social media content can significantly impact the information consumption experience: low diversity can provide focused content that may be simpler to understand, while high diversity can increase breadth in the exposure to multiple opinions and perspectives. Hence, to address our research question, we turn to diversity as a core concept in our proposed sampling methodology. Here we are motivated by ideas in the "compressive sensing" literature and utilize the notion of sparsity in social media information to represent such large spaces via a small number of basis components. Thereafter, we use a greedy iterative clustering technique on this transformed space to construct samples matching a desired level of diversity. Based on Twitter Firehose data, we demonstrate quantitatively that our method is robust, and performs better than other baseline techniques over a variety of trending topics. In a user study, we further show that users find samples generated by our method to be more interesting and subjectively engaging compared to techniques inspired by state-of-the-art systems, with improvements in the range of 15-45%.
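The idea of building a sample that matches a desired level of diversity can be illustrated with a minimal sketch. This is not the authors' algorithm: it skips the sparse basis transformation entirely and swaps in a plain greedy selection over raw attribute counts, with normalized Shannon entropy standing in as the diversity measure. The function names, the entropy measure, and the toy attribute vectors are all assumptions made for illustration.

```python
import math


def normalized_entropy(counts):
    """Shannon entropy of a count vector, scaled to the range [0, 1]."""
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    h = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return h / math.log(len(counts))


def greedy_diversity_sample(items, target, size):
    """Greedily pick `size` item indices whose pooled attribute counts
    have normalized entropy as close as possible to `target`.

    `items` is a list of equal-length attribute count vectors
    (a hypothetical stand-in for tweets described by their attributes).
    """
    if not items:
        return []
    pooled = [0.0] * len(items[0])
    remaining = list(range(len(items)))
    chosen = []
    while remaining and len(chosen) < size:
        def gap(i):
            # Diversity gap if item i were added to the current sample.
            merged = [p + x for p, x in zip(pooled, items[i])]
            return abs(normalized_entropy(merged) - target)

        best = min(remaining, key=gap)
        pooled = [p + x for p, x in zip(pooled, items[best])]
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy data: each item has exactly one of three attributes.
toy_items = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
low_diversity = greedy_diversity_sample(toy_items, target=0.0, size=2)
high_diversity = greedy_diversity_sample(toy_items, target=1.0, size=3)
```

With a target of 0.0 the sketch keeps picking items that share an attribute, while a target of 1.0 pushes it to cover all three attributes.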
Rating Friends Without Making Enemies
Adamic, Lada A. (University of Michigan) | Lauterbach, Debra (University of Michigan) | Teng, Chun-Yuen (University of Michigan) | Ackerman, Mark (University of Michigan)
As online social networks expand their role beyond maintaining existing relationships, they may look to more faceted ratings to support the formation of new connections between their users. Our study focuses on one community employing faceted ratings, CouchSurfing.org, and combines data analysis of ratings, a large-scale survey, and in-depth interviews. In order to understand the ratings, we revisit the notions of friendship and trust and uncover an asymmetry: close friendship includes trust, but high levels of trust can be achieved without close friendship. To users, providing faceted ratings presents challenges, including differentiating and quantifying inherently subjective feelings such as friendship and trust, concern over a friend's reaction to a rating, and knowledge of how ratings can affect others' reputations. One consequence of these issues is the near absence of negative feedback, even though a small portion of actual experiences and privately held ratings are negative. We show how users take this into account when formulating and interpreting ratings, and discuss designs that could encourage more balanced feedback.
Modeling Public Mood and Emotion: Twitter Sentiment and Socio-Economic Phenomena
Bollen, Johan (Indiana University) | Mao, Huina (Indiana University) | Pepe, Alberto (Harvard University)
We perform a sentiment analysis of all tweets published on the microblogging platform Twitter in the second half of 2008. We use a psychometric instrument to extract six mood states (tension, depression, anger, vigor, fatigue, confusion) from the aggregated Twitter content and compute a six-dimensional mood vector for each day in the timeline. We compare our results to a record of popular events gathered from media and other sources. We find that events in the social, political, cultural and economic sphere do have a significant, immediate and highly specific effect on the various dimensions of public mood. We speculate that large-scale analyses of mood can provide a solid platform to model collective emotive trends in terms of their predictive value with regard to existing social as well as economic indicators.
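A daily six-dimensional mood vector of the kind described above could be computed roughly as follows. This is a sketch under stated assumptions, not the paper's procedure: the tiny term-to-mood lexicon is invented for illustration and is not the psychometric instrument the authors used, and the per-day normalization by tweet count is likewise an assumption.

```python
from collections import defaultdict

MOODS = ("tension", "depression", "anger", "vigor", "fatigue", "confusion")

# Hypothetical toy lexicon mapping a term to one mood dimension.
# It only stands in for a real psychometric instrument.
TOY_LEXICON = {
    "nervous": "tension",
    "sad": "depression",
    "furious": "anger",
    "energetic": "vigor",
    "tired": "fatigue",
    "puzzled": "confusion",
}


def daily_mood_vectors(tweets, lexicon=TOY_LEXICON):
    """Map (day, text) pairs to {day: six scores}, one score per mood.

    Each score is the number of matching terms seen that day divided
    by the number of tweets posted that day (an assumed normalization).
    """
    counts = defaultdict(lambda: [0] * len(MOODS))
    totals = defaultdict(int)
    for day, text in tweets:
        totals[day] += 1
        for token in text.lower().split():
            mood = lexicon.get(token.strip(".,!?"))
            if mood is not None:
                counts[day][MOODS.index(mood)] += 1
    return {day: [c / totals[day] for c in counts[day]] for day in totals}


sample_tweets = [
    ("2008-11-04", "So tired and sad today."),
    ("2008-11-04", "Feeling energetic!"),
    ("2008-11-05", "Nothing to report"),
]
vectors = daily_mood_vectors(sample_tweets)
```

Each resulting vector follows the order of `MOODS`, so a time series for one mood is simply one coordinate read across days.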
Exploiting Semantic Annotations for Clustering Geographic Areas and Users in Location-based Social Networks
Noulas, Anastasios (University of Cambridge) | Scellato, Salvatore (University of Cambridge) | Mascolo, Cecilia (University of Cambridge) | Pontil, Massimiliano (University College London)
Location-Based Social Networks (LBSN) present so far the most vivid realization of the convergence of the physical and virtual social planes. In this work we propose a novel approach to modeling human activity and geographical areas by means of place categories. We apply a spectral clustering algorithm on areas and users of two metropolitan cities on a dataset sourced from the most vibrant LBSN, Foursquare. Our methodology allows the identification of user communities that visit similar categories of places and the comparison of urban neighborhoods within and across cities. We demonstrate how semantic information attached to places could be plausibly used as a modeling interface for applications such as recommender systems and digital tourist guides.
Asked and Answered: On Qualities and Quantities of Answers in Online Q&A Sites
Logie, John (University of Minnesota) | Weinberg, Joseph (University of Minnesota) | Harper, F. Maxwell (University of Minnesota) | Konstan, Joseph A. (University of Minnesota)
This paper builds upon several recent research efforts that have explored the nature and qualities of questions asked on social Q&A sites by offering a focused examination of answers posted to three of the most popular Q&A sites. Specifically, this paper examines sets of answers responding to specific types of questions and explores the degree to which question types are predictive of answer quantity and answer quality. Blending qualitative and quantitative methods, the paper builds upon rich coding of representative sets of real questions, drawn from Answerbag, (Ask) MetaFilter, and Yahoo! Answers, in order to better understand whether the explicit and implicit theories and predictions drawn from coding of these questions were borne out in the corresponding answer sets found on these sites. Quantitative findings include data underscoring the general overall success of social Q&A sites in producing answers that can satisfy the needs of those who pose questions. Additionally, this paper presents a predictive model that can anticipate the archival value of answers based on the category and qualities of questions asked. Qualitative findings include an analysis of the variation in responses to questions that are primarily seeking objective, grounded information relative to those seeking subjective opinions.
Modeling the Detection of Textual Cyberbullying
Dinakar, Karthik (Massachusetts Institute of Technology) | Reichart, Roi (Hebrew University of Jerusalem) | Lieberman, Henry (Massachusetts Institute of Technology)
The scourge of cyberbullying has assumed alarming proportions with an ever-increasing number of adolescents admitting to having dealt with it either as a victim or as a bystander. Anonymity and the lack of meaningful supervision in the electronic medium are two factors that have exacerbated this social menace. Comments or posts involving sensitive topics that are personal to an individual are more likely to be internalized by a victim, often resulting in tragic outcomes. We decompose the overall detection problem into detection of sensitive topics, which lends itself to text classification sub-problems. We experiment with a corpus of 4500 YouTube comments, applying a range of binary and multiclass classifiers. We find that binary classifiers for individual labels outperform multiclass classifiers. Our findings show that the detection of textual cyberbullying can be tackled by building individual topic-sensitive classifiers.
Social Mechanics: An Empirically Grounded Science of Social Media
Lerman, Kristina (USC Information Sciences Institute) | Galstyan, Aram (USC Information Sciences Institute) | Steeg, Greg Ver (USC Information Sciences Institute) | Hogg, Tad (Hewlett-Packard)
What will social media sites of tomorrow look like? What behaviors will their interfaces enable? A major challenge for designing new sites that allow a broader range of user actions is the difficulty of extrapolating from experience with current sites without first distinguishing correlations from underlying causal mechanisms. The growing availability of data on user activities provides new opportunities to uncover correlations among user activity, contributed content and the structure of links among users. However, such correlations do not necessarily translate into predictive models. Instead, empirically grounded mechanistic models provide a stronger basis for establishing causal mechanisms and discovering the underlying statistical laws governing social behavior. We describe a statistical physics-based framework for modeling and analyzing social media and illustrate its application to the problems of prediction and inference. We hope these examples will inspire the research community to explore these methods to look for empirically valid causal mechanisms for the observed correlations.
TweetTrader.net: Leveraging Crowd Wisdom in a Stock Microblogging Forum
Sprenger, Timm Oliver (Technische Universität München)
TweetTrader.net is a stock microblogging forum that leverages the wisdom of crowds to aggregate the information contained in stock-related tweets. Based on insights from academic research on stock microblogs, the application integrates inputs from text classification, user voting and a proprietary Stock Game in order to extract the sentiment (i.e., the bullishness) of online investors with respect to all publicly traded companies of the S&P 500.
Personalized Landmark Recommendation Based on Geotags from Photo Sharing Sites
Shi, Yue (Delft University of Technology) | Serdyukov, Pavel (Yandex) | Hanjalic, Alan (Delft University of Technology) | Larson, Martha (Delft University of Technology)
Geotagged photos of users on social media sites provide abundant location-based data, which can be exploited for various location-based services, such as travel recommendation. In this paper, we propose a novel approach to a new application, i.e., personalized landmark recommendation based on users’ geotagged photos. We formulate the landmark recommendation task as a collaborative filtering problem, for which we propose a category-regularized matrix factorization approach that integrates both user-landmark preference and category-based landmark similarity. We collected geotagged photos from Flickr and landmark categories from Wikipedia for our experiments. Our experimental results demonstrate that the proposed approach outperforms popularity-based landmark recommendation and a basic matrix factorization approach in recommending personalized landmarks that are less visited by the population as a whole.
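To make the idea of category-regularized matrix factorization concrete, here is a minimal sketch. The abstract does not give the objective, so the form used here is an assumption: squared error on observed user-landmark preferences, an L2 penalty on the factors, and an extra penalty that pulls together the latent factors of landmarks judged similar by category. The function name, hyperparameters, and toy data are all hypothetical.

```python
import random


def train_category_regularized_mf(ratings, similarity, n_users, n_items,
                                  k=2, lr=0.05, reg=0.01, alpha=0.1,
                                  epochs=200, seed=0):
    """Stochastic gradient descent on an assumed objective:

        sum over observed (u, i): (r_ui - p_u . q_i)^2
        + reg   * (squared norms of all factors)
        + alpha * sum over similar (i, j): s_ij * ||q_i - q_j||^2

    `ratings` is a list of (user, item, value) triples and
    `similarity` maps item -> {similar item: weight}.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            neighbours = similarity.get(i, {})
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Category term: pull item i toward its similar items.
                pull = sum(s * (qi - Q[j][f]) for j, s in neighbours.items())
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi - alpha * pull)
    return P, Q


def predict(P, Q, u, i):
    """Predicted preference of user u for item (landmark) i."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


# Toy data: two users, three landmarks; landmarks 0 and 1 share a category.
toy_ratings = [(0, 0, 1.0), (0, 1, 1.0), (1, 0, 1.0), (1, 2, 1.0)]
toy_similarity = {0: {1: 1.0}, 1: {0: 1.0}}
P, Q = train_category_regularized_mf(toy_ratings, toy_similarity, 2, 3)
```

Calling `predict(P, Q, 1, 1)` then scores a landmark that user 1 has not visited; the category term is what lets landmark 1 borrow signal from the similar landmark 0.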
“Dancing with the Stars,” NBA Games, Politics: An Exploration of Twitter Users’ Response to Events
Popescu, Ana-Maria (Yahoo! Labs) | Pennacchiotti, Marco (Yahoo! Labs)
Microblogging services such as Twitter offer great opportunities for analyzing the reactions of a wide audience with respect to current events. In this paper, we explore the correlation between types of user engagement and events centered around celebrities (e.g., personal or professional events involving Actors, Musicians, Politicians, Athletes).