Information Technology
The Pulse of News in Social Media: Forecasting Popularity
Bandari, Roja (University of California Los Angeles) | Asur, Sitaram (HP Labs) | Huberman, Bernardo A (HP Labs)
News articles are extremely time sensitive by nature. There is also intense competition among news items to propagate as widely as possible. Hence, the task of predicting the popularity of news items on the social web is both interesting and challenging. Prior research has dealt with predicting eventual online popularity based on early popularity. It is most desirable, however, to predict the popularity of items prior to their release, which makes it possible to decide in advance how to modify an article and the manner of its publication. In this paper, we construct a multi-dimensional feature space derived from properties of an article and evaluate the efficacy of these features as predictors of online popularity. We examine both regression and classification algorithms and demonstrate that, despite randomness in human behavior, it is possible to predict ranges of popularity on Twitter with an overall 84% accuracy. Our study also serves to illustrate the differences between traditionally prominent sources and those immensely popular on the social web.
Exploring Social-Historical Ties on Location-Based Social Networks
Gao, Huiji (Arizona State University) | Tang, Jiliang (Arizona State University) | Liu, Huan (Arizona State University)
Location-based social networks (LBSNs) have become a popular form of social media in recent years. They provide location-related services that allow users to "check in" at geographical locations and share such experiences with their friends. Millions of check-in records in LBSNs contain rich information about social and geographical context and provide a unique opportunity for researchers to study users' social behavior from a spatial-temporal perspective, which in turn enables a variety of services including place advertisement, traffic forecasting, and disaster relief. In this paper, we propose a social-historical model to explore users' check-in behavior on LBSNs. Our model integrates the social and historical effects and assesses the role of social correlation in users' check-in behavior. In particular, our model captures two properties of a user's check-in history, its power-law distribution and its short-term effect, and helps explain users' check-in behavior. The experimental results on a real-world LBSN demonstrate that our approach properly models users' check-ins and show how social and historical ties can help location prediction.
A Temporal Analysis of Posting Behavior in Social Media Streams
Lee, Bumsuk (The Catholic University of Korea)
In this work, we investigate social media streams to understand their characteristics and temporal aspects. We hypothesize that each blogger has a different temporal preference for posting. To test this hypothesis, we analyzed a massive dataset of nearly 700,000 blog articles, considering two factors: day of the week and time of day. We made comparisons at several levels: the blogosphere vs. Twitter, commercial vs. non-commercial blogs, and individual bloggers. We hope this work offers a starting point for developing a personalized system that reduces the system resources consumed by pull/fetch technology.
Not All Moods Are Created Equal! Exploring Human Emotional States in Social Media
De Choudhury, Munmun (Microsoft Research, Redmond) | Counts, Scott (Microsoft Research, Redmond) | Gamon, Michael (Microsoft Research, Redmond)
Emotional states of individuals, also known as moods, are central to the expression of thoughts, ideas and opinions, and in turn impact attitudes and behavior. As social media tools are increasingly used by individuals to broadcast their day-to-day happenings, or to report on an external event of interest, understanding the rich "landscape" of moods will help us better interpret and make sense of the behavior of millions of individuals. Motivated by literature in psychology, we study a popular representation of human mood landscape, known as the "circumplex model", that characterizes affective experience through two dimensions: valence and activation. We identify more than 200 moods frequent on Twitter, through Mechanical Turk studies and psychology literature sources, and report on four aspects of mood expression: the relationship between (1) moods and usage levels, including linguistic diversity of shared content, (2) moods and the social ties individuals form, (3) moods and amount of network activity of individuals, and (4) moods and participatory patterns of individuals such as link sharing and conversational engagement. Our results provide at-scale naturalistic assessments and extensions of existing conceptualizations of human mood in social media contexts.
Catching the Long-Tail: Extracting Local News Events from Twitter
Agarwal, Puneet (TCS Innovation Labs, Delhi) | Vaithiyanathan, Rajgopal (TCS Innovation Labs, Delhi) | Sharma, Saurabh (TCS Innovation Labs, Delhi) | Shroff, Gautam (TCS Innovation Labs, Delhi)
Twitter, used in 200 countries with over 250 million tweets a day, is a rich source of local news from around the world. Many events of local importance are first reported on Twitter, including many that never reach news channels. Further, there are often only a few tweets reporting each such event, in contrast with the larger volumes that follow events of wider significance. Even though such events may be primarily of local importance, they can also be of critical interest to some specific but possibly far-flung entities: for example, a fire in a supplier's factory half-way around the world may be of interest even from afar. In this paper we describe how this "long tail" of events can be detected in spite of their sparsity. We then extract and correlate information from multiple tweets describing the same event. Our generic architecture for converting a tweet-stream into event-objects uses locality sensitive hashing, classification, boosting, information extraction and clustering. Our results, based on millions of tweets monitored over many months, appear to validate our approach and architecture: we achieved success-rates in the 80% range for event detection and 76% on event-correlation; we also reduced tweet-comparisons by 80% using LSH.
People Are Strange When You're a Stranger: Impact and Influence of Bots on Social Networks
Aiello, Luca Maria (Università degli Studi di Torino) | Deplano, Martina (Università degli Studi di Torino) | Schifanella, Rossano (Università degli Studi di Torino) | Ruffo, Giancarlo (Università degli Studi di Torino)
Bots are, for many Web and social media users, the source of many dangerous attacks or the carrier of unwanted messages, such as spam. Nevertheless, crawlers and software agents are a precious tool for analysts, and they are continuously executed to collect data or to test distributed applications. However, no one knows the real potential of a bot whose purpose is to control a community, to manipulate consensus, or to influence user behavior. It is commonly believed that the better an agent simulates human behavior in a social network, the more it can succeed in generating an impact in that community. We help shed light on this issue through an online social experiment aimed at studying to what extent a bot with no trust, no profile, and no aim to reproduce human behavior can become popular and influential on a social media platform. Results show that a basic social probing activity can be used to acquire social relevance on the network and that the so-acquired popularity can be effectively leveraged to drive users in their social connectivity choices. We also observe that our bot activity unveiled hidden social polarization patterns in the community and triggered an emotional response from individuals that brings to light subtle privacy hazards perceived by the user base.
The Emergence of Conventions in Online Social Networks
Kooti, Farshad (Max Planck Institute for Software Systems) | Yang, Haeryun (KAIST) | Cha, Meeyoung (KAIST) | Gummadi, Krishna P. (MPI-SWS) | Mason, Winter A. (Stevens Institute of Technology)
The way in which social conventions emerge in communities has been of interest to social scientists for decades. Here we report on the emergence of a particular social convention on Twitter: the way to indicate a tweet is being reposted and to attribute the content to its source. Initially, different variations were invented and spread through the Twitter network. The inventors and early adopters were well-connected, active, core members of the Twitter community. The diffusion networks of these conventions were dense and highly clustered, so no single user was critical to the adoption of the conventions. Despite being invented at different times and having different adoption rates, only two variations came to be widely adopted. In this paper we describe this process in detail, highlighting insights and raising questions about how social conventions emerge.
More or Less: Amount of Personal Information Displayed in Social Network Site Profiles and Its Impact on Viewers' Intentions to Socialize with the Profile Owner
Baruh, Lemi (Koc University) | Chisik, Yoram (University of Madeira) | Bisson, Christophe (Kadir Has University) | Senova, Basak (NOMAD)
This paper presents the results of an experiment that employed a 2 (low vs. high information) by 2 (male vs. female profile) design to investigate the relationship between the amount of information displayed in a Social Network Site (SNS) profile and profile viewers' intentions to engage in further social interactions (communicate online, add to SNS profile, and meet face-to-face) with the profile owner. The results indicate that more information increases the likelihood of relationship initiation for male profiles but decreases it for female profiles. Also, viewers are inclined to initiate an interaction when less information is presented in the SNS profile of a person of the opposite sex, but require more information from profiles of their own sex.
On the Study of Social Interactions in Twitter
Macskassy, Sofus A. (University of Southern California)
Twitter and other social media platforms are increasingly used as the primary way in which people speak with each other. As opposed to other platforms, Twitter is interesting in that many of these dialogues are public, so we can get a view into the dynamics of dialogues and how they differ from other tweet behaviors. Here we analyze tweets gathered from 2400 Twitter streams over a one-month period. We study social interactions in three important dimensions: what are the salient user behaviors in terms of how often they have social interactions and how these interactions are spread among different people; what are the characteristics of the dialogues, or sets of tweets, that we can extract from these interactions; and what are the characteristics of the social network which emerges from considering these interactions? We find that roughly half of the users spend a fair amount of time interacting whereas 40% of users do not seem to have active interactions. We also find that the vast majority of active dialogues only involve two people despite the public nature of these tweets. Finally, we find that while the emerging social network does contain a giant component, the component clearly is a set of well-defined tight clusters which are loosely connected.
Filtering Noisy Web Data by Identifying and Leveraging Users' Contributions
In this paper we present several methods for collecting textual Web content and filtering noisy data. We show that knowing which user publishes which content can contribute to detecting noise. We begin by collecting data from two forums and from Twitter. For the forums, we extract the meaningful information from each discussion (texts of questions and answers, IDs of users, date). For the Twitter dataset, we first detect tweets with very similar texts, which helps avoid redundancy in further analysis. This also leads us to clusters of tweets that can be used in the same way as the forum discussions: they can be modeled by bipartite graphs. The analysis of the nodes of the resulting graphs shows that network structure and content type (noisy or relevant) are not independent, so studying the network can help in filtering noise.