Trust Propagation with Mixed-Effects Models
Overgoor, Jan (Stanford University) | Wulczyn, Ellery (Stanford University) | Potts, Christopher (Stanford University)
Web-based social networks typically use public trust systems to facilitate interactions between strangers. These systems can be corrupted by misleading information spread under the cover of anonymity, or exhibit a strong bias towards positive feedback, originating from the fear of reciprocity. Trust propagation algorithms seek to overcome these shortcomings by inferring trust ratings between strangers from trust ratings between acquaintances and the structure of the network that connects them. We investigate a trust propagation algorithm that is based on user triads where the trust one user has in another is predicted based on an intermediary user. The propagation function can be applied iteratively to propagate trust along paths between a source user and a target user. We evaluate this approach using the trust network of the CouchSurfing community, which consists of 7.6M trust-valued edges between 1.1M users. We show that our model out-performs one that relies only on the trustworthiness of the target user (a kind of public trust system). In addition, we show that performance is significantly improved by bringing in user-level variability using mixed-effects regression models.
Evaluating Real-Time Search over Tweets
McCullough, Dean (National Institute of Standards and Technology) | Lin, Jimmy (University of Maryland) | Macdonald, Craig (University of Glasgow) | Ounis, Iadh (University of Glasgow) | McCreadie, Richard (University of Glasgow)
Twitter offers a phenomenal platform for the social sharing of information. We describe new resources that have been created in the context of the Text Retrieval Conference (TREC) to support the academic study of Twitter as a real-time information source. We formalize an information seeking task — real-time search — and offer a methodology for measuring system effectiveness. At the TREC 2011 Microblog Track, 58 research groups participated in the first ever evaluation of this task. We present data from the effort to illustrate and support our methodology.
Around the Water Cooler: Shared Discussion Topics and Contact Closeness in Social Search
Komanduri, Saranga (Carnegie Mellon University) | Fang, Lujun (University of Michigan at Ann Arbor) | Huffaker, David (Google, Inc) | Staddon, Jessica (Google, Inc)
Search engines are now augmenting search results with social annotations, i.e., endorsements from users’ social network contacts. However, there is currently a dearth of published research on the effects of these annotations on user choice. This work investigates two research questions associated with annotations: 1) do some contacts affect user choice more than others, and 2) are annotations relevant across various information needs. We conduct a controlled experiment with 355 participants, using hypothetical searches and annotations, and elicit users’ choices. We find that domain contacts are preferred to close contacts, and this preference persists across a variety of information needs. Further, these contacts need not be experts and might be identified easily from conversation data.
OMG, I Have to Tweet that! A Study of Factors that Influence Tweet Rates
Kıcıman, Emre (Microsoft Research)
Many studies have shown that social data such as tweets are a rich source of information about the real world, including, for example, insights into health trends. A key limitation when analyzing Twitter data, however, is that it depends on people self-reporting their own behaviors and observations. In this paper, we present a large-scale quantitative analysis of some of the factors that influence self-reporting bias. In our study, we compare a year of tweets about weather events to ground-truth knowledge about actual weather occurrences. For each weather event we calculate how extreme, how expected, and how big a change the event represents. We calculate the extent to which these factors can explain the daily variations in tweet rates about weather events. We find that we can build global models that take into account basic weather information, together with extremeness, expectation, and change calculations, to account for over 40% of the variability in tweet rates. We also build location-specific models (one per metropolitan area) that account for an average of 70% of the variability in tweet rates.
Social Media Is NOT that Bad! The Lexical Quality of Social Media
Rello, Luz (Universitat Pompeu Fabra) | Baeza-Yates, Ricardo (Yahoo! Research)
There is a strong correlation between spelling errors and web text content quality. Using our lexical quality measure, based on a small corpus of spelling errors, we present an estimation of the lexical quality of the main Social Media sites. This paper presents an updated and complete analysis of the lexical quality of Social Media written in English and Spanish, including how lexical quality changes over time.
Modeling Spread of Disease from Social Interactions
Sadilek, Adam (University of Rochester) | Kautz, Henry (University of Rochester) | Silenzio, Vincent (University of Rochester)
Research in computational epidemiology to date has concentrated on coarse-grained statistical analysis of populations, often synthetic ones. By contrast, this paper focuses on fine-grained modeling of the spread of infectious diseases throughout a large real-world social network. Specifically, we study the roles that social ties and interactions between specific individuals play in the progress of a contagion. We focus on public Twitter data, where we find that for every health-related message there are more than 1,000 unrelated ones. This class imbalance makes classification particularly challenging. Nonetheless, we present a framework that accurately identifies sick individuals from the content of online communication. Evaluation on a sample of 2.5 million geo-tagged Twitter messages shows that social ties to infected, symptomatic people, as well as the intensity of recent co-location, sharply increase one's likelihood of contracting the illness in the near future. To our knowledge, this work is the first to model the interplay of social activity, human mobility, and the spread of infectious disease in a large real-world population. Furthermore, we provide the first quantifiable estimates of the characteristics of disease transmission on a large scale without active user participation, a step towards our ability to model and predict the emergence of global epidemics from day-to-day interpersonal interactions.
Facebook and Privacy: The Balancing Act of Personality, Gender, and Relationship Currency
Quercia, Daniele (University of Cambridge) | Casas, Diego Las (Universidade Federal de Minas Gerais) | Pesce, Joao Paulo (Universidade Federal de Minas Gerais) | Stillwell, David (University of Cambridge) | Kosinski, Michal (University of Cambridge) | Almeida, Virgilio (Universidade Federal de Minas Gerais) | Crowcroft, Jon (University of Cambridge)
Social media profiles are telling examples of the everyday need for disclosure and concealment. The balance between concealment and disclosure varies across individuals, and personality traits might partly explain this variability. Experimental findings on the relationship between information disclosure and personality have so far been inconsistent. We thus study this relationship anew with 1,313 Facebook users in the United States using two personality tests: the big five personality test and the self-monitoring test. We model the process of information disclosure in a principled way using Item Response Theory and correlate the resulting user disclosure scores with personality traits. We find a correlation with the trait of Openness and observe gender effects, in that men and women share equal amounts of private information, but men tend to make it more publicly available, well beyond their social circles. Interestingly, geographic (e.g., residence, hometown) and work-related information is used as relationship currency, in that it is selectively shared with social contacts and is rarely shared with the Facebook community at large.
More of a Receiver Than a Giver: Why Do People Unfollow in Twitter?
Kwak, Haewoon (Telefonica Research) | Moon, Sue (KAIST) | Lee, Wonjae (KAIST)
We propose a logistic regression model taking into account two analytically different sets of factors: structure and action. The factors include individual, dyadic, and triadic properties between ego and alter whose tie breakup is under consideration. From the model fitted to large-scale data, we find that 5 structural and 7 actional variables have significant explanatory power for unfollow. One unique finding from our quantitative analysis is that people appreciate receiving acknowledgements from others even in virtually unilateral communication relationships and are less likely to unfollow them: people are more of a receiver than a giver.
Transductive Learning for Real-Time Twitter Search
Zhang, Xin (Graduate University of Chinese Academy of Sciences) | He, Ben (Graduate University of Chinese Academy of Sciences) | Luo, Tiejian (Graduate University of Chinese Academy of Sciences)
Recency is an important dimension of relevance for real-time Twitter search, as users tend to be interested in fresh news and events. By incorporating various sources of evidence, the application of learning to rank (LTR) algorithms to real-time Twitter search has proven beneficial in finding not only relevant, but also recent tweets in response to given queries. However, the potential effectiveness brought by LTR may not have been fully exploited due to the lack of labeled data available for properly learning a ranking model, since human labels are expensive in real-world applications. To this end, this paper proposes a transductive algorithm that incrementally aggregates the labeled tweets through an iterative process. Experimental results on the standard Tweets11 dataset show that our approach is able to outperform strong baselines without the use of human labels.
Tweetin' in the Rain: Exploring Societal-Scale Effects of Weather on Mood
Hannak, Aniko (Northeastern University) | Anderson, Eric (Northeastern University) | Barrett, Lisa Feldman (Northeastern University) | Lehmann, Sune (Technical University of Denmark) | Mislove, Alan (Northeastern University) | Riedewald, Mirek (Northeastern University)
There has been significant recent interest in using the aggregate sentiment from social media sites to understand and predict real-world phenomena. However, the data from social media sites also offers a unique and so far unexplored opportunity to study the impact of external factors on aggregate sentiment, at the scale of a society. Using a Twitter-specific sentiment extraction methodology, we explore the patterns of sentiment present in a corpus of over 1.5 billion tweets. We focus primarily on the effect of the weather and time on aggregate sentiment, evaluating how clearly the well-known individual patterns translate into population-wide patterns. Using machine learning techniques on the Twitter corpus correlated with the weather at the time and location of the tweets, we find that aggregate sentiment follows distinct climate, temporal, and seasonal patterns.