Dimensions of Self-Expression in Facebook Status Updates

AAAI Conferences

We describe the dimensions along which Facebook users tend to express themselves via status updates using the semi-automated text analysis approach, the Meaning Extraction Method (MEM). First, we examined dimensions of self-expression in all status updates from a sample of four million Facebook users from four English-speaking countries (the United States, Canada, the United Kingdom, and Australia) in order to examine how these countries vary in their self-expressions. All four countries showed a basic three-component structure, indicating that the medium is a stronger influence than country characteristics or demographics on how people use Facebook status updates. In each country, people vary in terms of the extent to which they use Informal Speech, share Positive Events, and discuss School in their Facebook status updates. Together, these factors tell us how users differ in their self-expression, and thus illustrate meaningful use cases for the product: Talking about what’s going on tends to be positive, and people vary in terms of the extent to which their status updates are short, slangy emotional expressions and topics regarding school. The specific words that define these factors showed subtle differences across countries: The use of profanity indicates fewer school words (but only in Australia), whereas the UK shows greater use of slang terms (rather than profanity) when speaking informally. The MEM also identified English-language dialects as a meaningful dimension along which the countries varied. In sum, beyond simply indicating topicality of posts, this study provides insight into how status updates are used for self-expression. We discuss several theoretical frameworks that could produce these results, and more broadly discuss the generation of theoretical frameworks from wholly empirical data (such as naturalistic Internet speech) using the MEM.


Latent Set Models for Two-Mode Network Data

AAAI Conferences

Two-mode networks are a natural representation for many kinds of relational data. These networks are bipartite graphs consisting of two distinct sets ("modes") of entities. For example, one can model multiple recipient email data as a two-mode network of (a) individuals and (b) the emails that they send or receive. In this work we present a statistical model for two-mode network data which posits that individuals belong to latent sets and that the members of a particular set tend to co-appear. We show how to infer these latent sets from observed data using a Markov chain Monte Carlo inference algorithm. We apply the model to the Enron email corpus, using it to discover interpretable latent structure as well as evaluating its predictive accuracy on a missing data task. Extensions to the model are discussed that incorporate additional side information such as the email's sender or text content, further improving the accuracy of the model.
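As a rough illustration of the two-mode representation described above (not the authors' latent set model or their MCMC inference), the sketch below stores a toy email dataset as a mapping from emails to the people on them and counts how often each pair of individuals co-appears. The data and the function name `co_appearance_counts` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: one mode is emails, the other is individuals.
# Each email maps to the set of people who sent or received it.
emails = {
    "e1": {"alice", "bob", "carol"},
    "e2": {"alice", "bob"},
    "e3": {"carol", "dave"},
}


def co_appearance_counts(emails):
    """Count, for every pair of individuals, the emails they share."""
    counts = defaultdict(int)
    for people in emails.values():
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(people), 2):
            counts[(a, b)] += 1
    return dict(counts)


counts = co_appearance_counts(emails)
```

Pairs with high counts are the kind of co-appearance signal a latent set model would try to explain; the model itself is not reproduced here.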


Modelling Action Cascades in Social Networks

AAAI Conferences

The central idea in designing various marketing strategies for online social networks is to identify the influencers in the network. These influential individuals induce "word-of-mouth" effects in the network. They are responsible for triggering long cascades of influence that convince their peers to perform a similar action (buying a product, for instance). Targeting these influentials usually leads to a vast spread of the information across the network. Hence it is important to identify such individuals in a network. One way to measure an individual's capability to influence their peers is by their reach for a certain action. We formulate identifying the influencers in a network as a problem of predicting the average depth of cascades an individual can trigger. We first empirically identify factors that play a crucial role in triggering long cascades. Based on the analysis, we build a model for predicting the cascades triggered by a user for an action. The model uses features such as the influencing capabilities of the user and their friends, the influencing capability of the particular action, and other user and network characteristics. Experiments show that the model effectively improves the predictions over several baselines.
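The quantity being predicted, the average depth of the cascades a user triggers, can be sketched as follows. This is only an illustration of the target measure under an assumed representation (each cascade is a dict mapping a node to the peers it influenced); it is not the authors' prediction model, and the names `cascade_depth` and `average_depth` are hypothetical.

```python
def cascade_depth(children, root):
    """Return the number of influence hops from root to the deepest node."""
    depth = 0
    frontier = [root]
    seen = {root}
    while frontier:
        next_frontier = []
        for node in frontier:
            for child in children.get(node, []):
                if child not in seen:  # guard against repeated nodes
                    seen.add(child)
                    next_frontier.append(child)
        if next_frontier:
            depth += 1
        frontier = next_frontier
    return depth


def average_depth(cascades, root):
    """Average the depth over all cascades triggered by the same user."""
    depths = [cascade_depth(c, root) for c in cascades]
    return sum(depths) / len(depths) if depths else 0.0


# Hypothetical example: user "u" triggered two cascades.
cascades = [
    {"u": ["a", "b"], "a": ["c"]},  # u -> a -> c, depth 2
    {"u": ["d"]},                   # u -> d, depth 1
]
avg = average_depth(cascades, "u")
```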


Timing Tweets to Increase Effectiveness of Information Campaigns

AAAI Conferences

Microblogging websites such as Twitter are increasingly being used by businesses/campaigners for timely dissemination of information to their followers. The diffusion of a tweet depends on several factors: the activity of the follower nodes, the responsiveness of follower nodes to tweets from the source node, the out-degree of the follower nodes, the content of recent related tweets seen by the follower node, etc. Using such factors, in this paper, we propose a framework to measure the effectiveness of an information campaign over Twitter. We consider a positive as well as a negative metric to measure the impact of a tweet: while retweets are used to measure the positive impact, the lack of a timely response from an active follower node is taken as a potential negative impact. We investigate the scheduling of tweets to increase the net positive impact while keeping the net negative impact below a desired level. We propose and study several scheduling algorithms by casting the problem in a Markov Decision Process (MDP) framework. In order to compare our algorithms, we estimate the model parameters from tweet data collected using the Twitter API from an arbitrarily selected node and its 6837 followers over several months. For this dataset, we find that if successive tweets in the campaign are novel, then substantial gains over user activity based scheduling can be obtained by scheduling tweets in time slots where the ratio of the expected positive and negative metrics is high. We call this the MaxRatio policy and we show that it is optimal under certain conditions. In cases where we are not certain about the response of users to successive related tweets, we identify another algorithm (which we call MaxReach) as a robust alternative.
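The slot-selection idea behind the MaxRatio policy can be sketched as a simple ranking, assuming per-slot estimates of the expected positive and negative metrics are already available. The abstract does not give the MDP formulation or the conditions for optimality, so the function name, the inputs, and the smoothing term `eps` below are all assumptions made for illustration.

```python
def max_ratio_slots(expected_pos, expected_neg, k, eps=1e-9):
    """Pick the k time slots with the highest positive/negative ratio.

    expected_pos[i] and expected_neg[i] are assumed estimates of the
    positive metric (e.g. retweets) and negative metric (e.g. missed
    timely responses) for slot i. eps avoids division by zero.
    """
    ratios = [p / (n + eps) for p, n in zip(expected_pos, expected_neg)]
    ranked = sorted(range(len(ratios)), key=lambda i: ratios[i], reverse=True)
    return sorted(ranked[:k])  # return chosen slots in chronological order


# Hypothetical estimates for four time slots.
pos = [4.0, 1.0, 6.0, 2.0]
neg = [2.0, 1.0, 2.0, 4.0]
chosen = max_ratio_slots(pos, neg, k=2)
```

A greedy ranking like this ignores how earlier tweets change the response to later ones, which is the situation the abstract says MaxReach is meant to handle more robustly.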


Political Polarization on Twitter

AAAI Conferences

In this study we investigate how social media shape the networked public sphere and facilitate communication between communities with different political orientations. We examine two networks of political communication on Twitter, comprised of more than 250,000 tweets from the six weeks leading up to the 2010 U.S. congressional midterm elections. Using a combination of network clustering algorithms and manually-annotated data we demonstrate that the network of political retweets exhibits a highly segregated partisan structure, with extremely limited connectivity between left- and right-leaning users. Surprisingly this is not the case for the user-to-user mention network, which is dominated by a single politically heterogeneous cluster of users in which ideologically-opposed individuals interact at a much higher rate compared to the network of retweets. To explain the distinct topologies of the retweet and mention networks we conjecture that politically motivated individuals provoke interaction by injecting partisan content into information streams whose primary audience consists of ideologically-opposed users. We conclude with statistical evidence in support of this hypothesis.


Exploring Millions of Footprints in Location Sharing Services

AAAI Conferences

Location sharing services (LSS) like Foursquare, Gowalla, and Facebook Places support hundreds of millions of user-driven footprints (i.e., "checkins"). Those global-scale footprints provide a unique opportunity to study the social and temporal characteristics of how people use these services and to model patterns of human mobility, which are significant factors for the design of future mobile+location-based services, traffic forecasting, urban planning, as well as epidemiological models of disease spread. In this paper, we investigate 22 million checkins across 220,000 users and report a quantitative assessment of human mobility patterns by analyzing the spatial, temporal, social, and textual aspects associated with these footprints. We find that: (i) LSS users follow the “Levy Flight” mobility pattern and adopt periodic behaviors; (ii) While geographic and economic constraints affect mobility patterns, so does individual social status; and (iii) Content and sentiment-based analysis of posts associated with checkins can provide a rich source of context for better understanding how users engage with these services.


Location3: How Users Share and Respond to Location-Based Data on Social

AAAI Conferences

In August 2010 Facebook launched Places, a location-based service that allows users to check into points of interest and share their physical whereabouts with friends. The friends who see these events in their News Feed can then respond to these check-ins by liking or commenting on them. These data, consisting of the places people go and how their friends react to them, are a rich, novel dataset. In this paper we first analyze this dataset to understand the factors that influence where users check in, including previous check-ins, similarity to other places, where their friends check in, time of day, and demographics. We show how these factors can be used to build a predictive model of where users will check in next. Then we analyze how users respond to their friends’ check-ins and which factors contribute to users liking or commenting on them. We show how this can be used to improve the ranking of check-in stories, ensuring that users see only the most relevant updates from their friends and ensuring that businesses derive maximum value from check-ins at their establishments. Finally, we construct a model to predict friendship based on check-in count and show that co-check-ins have a statistically significant effect on friendship.


Trust Amongst Rogues? A Hypergraph Approach for Comparing Clandestine Trust Networks in MMOGs

AAAI Conferences

Gold farming and real money trade refer to a set of illicit practices in massively multiplayer online games (MMOGs) whereby players accumulate virtual resources to sell for “real world” money. Prior work has examined trade relationships formed by gold farmers but not the trust relationships which exist between members of these organizations. We adopt a hypergraph approach to model the multi-modal relationships of gold farmers granting other players permission to use and modify objects they own. We argue these permissions reflect underlying trust relationships which can be analyzed using network analysis methods. We compare farmers’ trust networks to the trust networks of both unidentified farmers and typical players. Our results demonstrate that gold farmers’ trust networks differ from those of normal players: farmers trust highly central non-farmer players but not each other. These findings have implications for augmenting detection methods and re-evaluating theories of clandestine behavior.


Informledge System: A Modified Knowledge Network with Autonomous Nodes using Multi-lateral Links

arXiv.org Artificial Intelligence

Research in the field of Artificial Intelligence is continually progressing toward simulating human knowledge in an automated intelligent knowledge base that can encode and retrieve knowledge efficiently while remaining consistent and scalable at all times. However, there is no system at hand that can match the diversified abilities of the human knowledge base. In this position paper, we put forward a theoretical model of a different system that intends to integrate pieces of knowledge, the Informledge System (ILS). ILS would encode knowledge by virtue of knowledge units linked across diversified domains. The proposed ILS comprises autonomous knowledge units termed Knowledge Network Nodes (KNNs), which would help in efficient cross-linking of knowledge units to encode fresh knowledge. These links are reasoned and inferred by the Parser and Link Manager, which are part of the KNN.


Knowledge Embedding and Retrieval Strategies in an Informledge System

arXiv.org Artificial Intelligence

Informledge System (ILS) is a knowledge network with autonomous nodes and intelligent links that integrate and structure pieces of knowledge. In this paper, we put forward strategies for knowledge embedding and retrieval in an ILS. ILS is a knowledge network system that deals with the logical storage and connectivity of information units to form knowledge using autonomous nodes and multi-lateral links. In ILS, the autonomous nodes, known as Knowledge Network Nodes (KNNs), play vital roles: they are used not only in storage, parsing, and forming the multi-lateral linkages between knowledge points, but also in realizing intelligent retrieval of linked information units in the form of knowledge. Knowledge built into the ILS takes the shape of a sphere. The intelligence incorporated into the links of a KNN helps in retrieving various knowledge threads from a specific set of KNNs. A developed entity of information realized through a KNN takes the shape of a knowledge cone.