Information Technology
Dimensions of Self-Expression in Facebook Status Updates
Kramer, Adam D. I. (Facebook, Inc.) | Chung, Cindy K. (The University of Texas at Austin)
We describe the dimensions along which Facebook users tend to express themselves via status updates using the semi-automated text analysis approach, the Meaning Extraction Method (MEM). First, we examined dimensions of self-expression in all status updates from a sample of four million Facebook users from four English-speaking countries (the United States, Canada, the United Kingdom, and Australia) in order to examine how these countries vary in their self-expressions. All four countries showed a basic three-component structure, indicating that the medium is a stronger influence than country characteristics or demographics on how people use Facebook status updates. In each country, people vary in terms of the extent to which they use Informal Speech, share Positive Events, and discuss School in their Facebook status updates. Together, these factors tell us how users differ in their self-expression, and thus illustrate meaningful use cases for the product: Talking about what’s going on tends to be positive, and people vary in terms of the extent to which their status updates are short, slangy emotional expressions and topics regarding school. The specific words that define these factors showed subtle differences across countries: The use of profanity indicates fewer school words (but only in Australia), whereas the UK shows greater use of slang terms (rather than profanity) when speaking informally. The MEM also identified English-language dialects as a meaningful dimension along which the countries varied. In sum, beyond simply indicating topicality of posts, this study provides insight into how status updates are used for self-expression. We discuss several theoretical frameworks that could produce these results, and more broadly discuss the generation of theoretical frameworks from wholly empirical data (such as naturalistic Internet speech) using the MEM.
Hierarchical Bayesian Models for Latent Attribute Detection in Social Media
Rao, Delip (Johns Hopkins University) | Paul, Michael (Johns Hopkins University) | Fink, Clay (Johns Hopkins University) | Yarowsky, David (Johns Hopkins University) | Oates, Timothy (University of Maryland Baltimore County) | Coppersmith, Glen (JHU Human Language Technology Center of Excellence)
We present several novel minimally-supervised models for detecting latent attributes of social media users, with a focus on ethnicity and gender. Previous work on ethnicity detection has used coarse-grained, widely separated classes of ethnicity and assumed the existence of large amounts of training data such as the US census, simplifying the problem. Instead, we examine content generated by users in addition to name morpho-phonemics to detect ethnicity and gender. Further, we address this problem in a challenging setting where the ethnicity classes are more fine-grained -- ethnicity classes in Nigeria -- and with very limited training data.
GlobalInferencer: Linking Personal Social Content with Data on the Web
Paradesi, Sharon (Massachusetts Institute of Technology) | Shih, Fuming (Massachusetts Institute of Technology)
The past year has seen a growing public awareness of the privacy risks of social networking through personal information that people voluntarily disclose. A spotlight has accordingly been turned on the disclosure policies of social networking sites and on mechanisms for restricting access to personal information on Facebook and other sites. But this is not sufficient to address privacy concerns in a world where Web-based data mining tools can let anyone infer information about others by combining data from multiple sources. To illustrate this, we are building a demonstration data miner, GlobalInferencer, that makes inferences about an individual’s lifestyle and other behavior. GlobalInferencer uses linked data technology to perform unified searches across Facebook, Flickr, and public data sites. It demonstrates that controlling access to personal information on individual social networking sites is not an adequate framework for protecting privacy, or even for supporting valid inferencing. In addition to access restrictions, there must be mechanisms for maintaining the provenance of information combined from multiple sources, for revealing the context within which information is presented, and for respecting the accountability that determines how information should be used.
The Effect of Mobile Platforms on Twitter Content Generation
Perreault, Mathieu (McGill University) | Ruths, Derek (McGill University)
The increased popularity of feature-rich mobile devices in recent years has enabled widespread consumption and production of social media content via mobile devices. Because mobile devices and mobile applications change the context within which an individual generates and consumes microblog content, we might expect microblogging behavior to differ depending on whether the user is using a mobile device. To our knowledge, little has been established about what, if any, effects such mobile interfaces have on microblogging. In this paper, we investigate this question within the context of Twitter, among the most popular microblogging platforms. This work makes three specific contributions. First, we quantify the ways in which user profiles are affected by the mobile context: (1) the extent to which users tend to be either fully non-mobile or mobile and (2) the relative activity of the mobile Twitter community. Second, we assess the differences in content between mobile and non-mobile tweets (posts to the Twitter platform). Our results show that mobile platforms produce very different patterns of Twitter usage. As part of our analysis, we propose and apply a classification system for tweets. We consider this to be the third contribution of this work. While other classification systems have been proposed, ours is the first to permit the independent encoding of a tweet’s form, content, and intended audience. In this paper we apply this system to show how tweets differ between mobile and non-mobile contexts. However, because of its flexibility and breadth, the schema may be useful to researchers studying Twitter content in other contexts as well.
Structure and Reciprocity in Technology-Centered Q&A Communities
Jiang, Ming (University of Michigan) | Dong, Tao (University of Michigan) | Chang, Yung-Ju (University of Michigan)
In this paper we examine the network structure of the MythTV mailing list, an online technology Q&A user community, and we use time-series analysis techniques to study users’ reciprocity behavior in this community. We find that the amount of help users provide is strongly correlated with the amount of help they receive. Further, by conducting the Granger causality test on the time series data of active users’ activity, we find that the amount of help given is actually the reason why one gets a lot of help. This finding corresponds to the concept of directed reciprocity in social networks and provides insights into social dynamics in technology-centered online communities.
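The Granger causality test mentioned in this abstract can be illustrated with a minimal single-lag sketch. It compares a regression of help received on its own past against one that also includes past help given, and reports the F statistic for the added term. The function names, the single lag, and the toy series are assumptions made for illustration; the abstract does not state the lag order or the data used. Note that the test measures predictive precedence between two series, not causation in the everyday sense.

```python
def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    # Solve (X^T X) beta = X^T y by Gauss-Jordan elimination with pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f_stat(x, y):
    """F statistic for 'x Granger-causes y' using one lag (an assumed lag order)."""
    targets = y[1:]
    # Restricted model: y_t ~ 1 + y_{t-1}
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    # Unrestricted model: y_t ~ 1 + y_{t-1} + x_{t-1}
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    n = len(targets)
    # One restriction tested; the unrestricted model has 3 parameters.
    return (rss_r - rss_u) / (rss_u / (n - 3))


# Hypothetical weekly counts: help received follows help given with a one-week delay.
help_given = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
noise = [0.1, -0.2, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, -0.1, 0.2, -0.2, 0.1]
help_received = [2.0] + [0.8 * help_given[t - 1] + noise[t] for t in range(1, 12)]

print("given -> received F:", granger_f_stat(help_given, help_received))
print("received -> given F:", granger_f_stat(help_received, help_given))
```

A large F statistic in one direction and a small one in the other is the pattern the abstract describes; in practice the statistic would be compared against an F distribution to obtain a p-value.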
Culture Matters: A Survey Study of Social Q&A Behavior
Yang, Jiang (University of Michigan) | Morris, Meredith Ringel (Microsoft Research) | Teevan, Jaime (Microsoft Research) | Adamic, Lada A. (University of Michigan) | Ackerman, Mark S. (University of Michigan)
Online social networking tools are used around the world by people to ask questions of their friends, because friends provide direct, reliable, contextualized, and interactive responses. However, although the tools used in different cultures for question asking are often very similar, the way they are used can be very different, reflecting unique inherent cultural characteristics. We present the results of a survey designed to elicit cultural differences in people’s social question asking behaviors across the United States, the United Kingdom, China, and India. The survey received responses from 933 people distributed across the four countries who held similar job roles and were employed by a single organization. Responses included information about the questions they ask via social networking tools, and their motivations for asking and answering questions online. The results reveal culture as a consistently significant factor in predicting people’s social question and answer behavior. The prominent cultural differences we observe might be traced to people’s inherent cultural characteristics (e.g., their cognitive patterns and social orientation), and should be comprehensively considered in designing social search systems.
Does Bad News Go Away Faster?
Wu, Shaomei (Cornell University) | Tan, Chenhao (Cornell University) | Kleinberg, Jon (Cornell University) | Macy, Michael Walton (Cornell University)
We study the relationship between content and temporal dynamics of information on Twitter, focusing on the persistence of information. We compare two extreme temporal patterns in the decay rate of URLs embedded in tweets, defining a prediction task to distinguish between URLs that fade rapidly following their peak of popularity and those that fade more slowly. Our experiments show a strong association between the content and the temporal dynamics of information: given unigram features extracted from corresponding HTML webpages, a linear SVM classifier can predict the temporal pattern of URLs with high accuracy. We further explore the content of URLs in the two temporal classes using various textual analysis techniques (via LIWC and trend detection). We find that the rapidly-fading information contains significantly more words related to negative emotion, actions, and more complicated cognitive processes, whereas the persistent information contains more words related to positive emotion, leisure, and lifestyle.
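The classification step described in this abstract, unigram features fed to a linear SVM, can be sketched as follows. This is a toy illustration only: the training texts, the labels (+1 for persistent, -1 for rapidly fading), the hyperparameters, and the Pegasos-style sub-gradient trainer without a bias term are all assumptions, not the authors' setup.

```python
from collections import Counter


def unigram_features(text, vocab):
    """Map a text to a vector of unigram counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]


def train_linear_svm(samples, labels, epochs=50, lam=0.01):
    """Sub-gradient descent on the L2-regularised hinge loss (Pegasos-style, no bias)."""
    w = [0.0] * len(samples[0])
    step = 0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            step += 1
            eta = 1.0 / (lam * step)  # decaying step size
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # Shrink weights (regularisation), then correct margin violations.
            w = [wi * (1.0 - eta * lam) for wi in w]
            if margin < 1:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 (persistent) or -1 (rapidly fading) from the sign of the score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Hypothetical page texts standing in for the HTML content behind each URL.
train = [
    ("relaxing beach holiday recipes", 1),
    ("angry protest after crash", -1),
    ("happy garden music weekend", 1),
    ("sad outage blamed on failure", -1),
]
vocab = sorted({word for text, _ in train for word in text.split()})
X = [unigram_features(text, vocab) for text, _ in train]
y = [label for _, label in train]
weights = train_linear_svm(X, y)

print(predict(weights, unigram_features("beach holiday", vocab)))
print(predict(weights, unigram_features("angry crash", vocab)))
```

The learned weight vector is also directly inspectable: words with large positive or negative weights are the ones the classifier associates with each temporal class.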
Using Network Structure to Identify Groups in Virtual Worlds
Shah, Fahad (University of Central Florida) | Sukthankar, Gita Reese (University of Central Florida)
Humans are adept social animals capable of identifying friendship groups from a combination of linguistic cues and social network patterns. But what is more important, the content of what people say or their history of social interactions? Moreover, is it possible to identify whether people are part of a group with changing membership merely from general network properties, such as measures of centrality and latent communities? In this paper, we address the problem of identifying social groups from conversation data and present results of an empirical study on identifying groups in a virtual world. Virtual worlds are interesting because group membership is more shaped by common interests and less influenced by cultural and socio-economic factors. Our finding is that a combination of network measures is more predictive of group membership than language cues, and that both types of features can be combined to improve prediction.
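As a concrete example of the general network properties this abstract refers to, the sketch below computes degree centrality from an undirected interaction list. The choice of degree centrality and the toy edge list are assumptions; the abstract does not say which centrality measures were used.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of other nodes each node is directly connected to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical conversation pairs observed in a virtual world.
conversations = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
for node, score in sorted(degree_centrality(conversations).items()):
    print(node, round(score, 3))
```

Scores such as these could serve as per-user features alongside language cues when predicting group membership.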
Latent Set Models for Two-Mode Network Data
DuBois, Christopher (University of California, Irvine) | Foulds, James (University of California, Irvine) | Smyth, Padhraic (University of California, Irvine)
Two-mode networks are a natural representation for many kinds of relational data. These networks are bipartite graphs consisting of two distinct sets ("modes") of entities. For example, one can model multiple recipient email data as a two-mode network of (a) individuals and (b) the emails that they send or receive. In this work we present a statistical model for two-mode network data which posits that individuals belong to latent sets and that the members of a particular set tend to co-appear. We show how to infer these latent sets from observed data using a Markov chain Monte Carlo inference algorithm. We apply the model to the Enron email corpus, using it to discover interpretable latent structure as well as evaluating its predictive accuracy on a missing data task. Extensions to the model are discussed that incorporate additional side information such as the email's sender or text content, further improving the accuracy of the model.
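The two-mode representation described in this abstract can be sketched as a mapping from events (emails) to the individuals involved, from which raw co-appearance counts follow directly. This is only the data representation and a descriptive statistic, not the latent set model or its MCMC inference; the names and toy emails are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_appearance_counts(events):
    """Count how often each pair of individuals appears in the same event.

    `events` maps an event id (e.g. an email) to the set of individuals
    involved in it, which is one way to store a two-mode (bipartite) network.
    """
    counts = defaultdict(int)
    for people in events.values():
        for a, b in combinations(sorted(people), 2):
            counts[(a, b)] += 1
    return dict(counts)


# Hypothetical emails and the people who sent or received them.
emails = {
    "e1": {"ann", "bob", "cat"},
    "e2": {"ann", "bob"},
    "e3": {"cat", "dan"},
}
print(co_appearance_counts(emails))
```

Pairs with high counts are the kind of pattern a latent set model would try to explain through shared set membership.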
Location3: How Users Share and Respond to Location-Based Data on Social
Chang, Jonathan (Facebook) | Sun, Eric (Facebook)
In August 2010 Facebook launched Places, a location-based service that allows users to check into points of interest and share their physical whereabouts with friends. The friends who see these events in their News Feed can then respond to these check-ins by liking or commenting on them. These data, consisting of the places people go and how their friends react to them, are a rich, novel dataset. In this paper we first analyze this dataset to understand the factors that influence where users check in, including previous check-ins, similarity to other places, where their friends check in, time of day, and demographics. We show how these factors can be used to build a predictive model of where users will check in next. Then we analyze how users respond to their friends’ check-ins and which factors contribute to users liking or commenting on them. We show how this can be used to improve the ranking of check-in stories, ensuring that users see only the most relevant updates from their friends and ensuring that businesses derive maximum value from check-ins at their establishments. Finally, we construct a model to predict friendship based on check-in count and show that co-check-ins have a statistically significant effect on friendship.
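The co-check-in feature mentioned at the end of this abstract could be computed along the following lines. The definition used here, two different users checking in at the same place within a fixed time window, is an assumption for illustration; the abstract does not define a co-check-in, and the record layout and one-hour window are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_checkin_counts(checkins, window_seconds=3600):
    """Count co-check-ins per user pair.

    `checkins` is a list of (user, place, unix_time) tuples. Two check-ins
    count as a co-check-in when they are by different users, at the same
    place, and at most `window_seconds` apart (an assumed definition).
    """
    by_place = {}
    for user, place, ts in checkins:
        by_place.setdefault(place, []).append((ts, user))
    counts = Counter()
    for visits in by_place.values():
        visits.sort()  # chronological order, so t2 >= t1 below
        for (t1, u1), (t2, u2) in combinations(visits, 2):
            if u1 != u2 and t2 - t1 <= window_seconds:
                counts[tuple(sorted((u1, u2)))] += 1
    return counts


# Hypothetical check-in log.
log = [
    ("ann", "cafe", 0),
    ("bob", "cafe", 600),
    ("ann", "gym", 5000),
    ("bob", "gym", 5300),
    ("cat", "cafe", 90000),
]
print(co_checkin_counts(log))
```

The resulting per-pair count is the kind of input a friendship-prediction model could use as a feature.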