Homophily and Latent Attribute Inference: Inferring Latent Attributes of Twitter Users from Neighbors

AAAI Conferences

In this paper, we extend existing work on latent attribute inference by leveraging the principle of homophily: we evaluate the inference accuracy gained by augmenting the user features with features derived from the Twitter profiles and postings of her friends. We consider three attributes which have varying degrees of assortativity: gender, age, and political affiliation. Our approach yields a significant and robust increase in accuracy for both age and political affiliation, indicating that our approach boosts performance for attributes with moderate to high assortativity. Furthermore, different neighborhood subsets yielded optimal performance for different attributes, suggesting that different subsamples of the user's neighborhood characterize different aspects of the user herself. Finally, inferences using only the features of a user's neighbors outperformed those based on the user's features alone. This suggests that the neighborhood context alone carries substantial information about the user.


You Too?! Mixed-Initiative LDA Story Matching to Help Teens in Distress

AAAI Conferences

Adolescent cyber-bullying on social networks is a phenomenon that has received widespread attention. Recent work by sociologists has examined this phenomenon under the larger context of teenage drama and its manifestations on social networks. Tackling cyber-bullying involves two key components: automatic detection of possible cases, and interaction strategies that encourage reflection and emotional support. Key is showing distressed teenagers that they are not alone in their plight. Conventional topic spotting and document classification into labels like "dating" or "sports" are not enough to effectively match stories for this task. In this work, we examine a corpus of 5500 stories from distressed teenagers from a major youth social network. We combine Latent Dirichlet Allocation and human interpretation of its output using principles from sociolinguistics to extract high-level themes in the stories and use them to match new stories to similar ones. A user evaluation of the story matching shows that theme-based retrieval does a better job of finding relevant and effective stories for this application than conventional approaches.


Identifying Microblogs for Targeted Contextual Advertising

AAAI Conferences

Micro-blogging sites such as Facebook, Twitter, and Google+ present a good opportunity for targeting advertisements that are contextually related to the microblog content. The sparse and noisy text makes identifying the microblogs suitable for advertising a very hard problem. In this work, we approach the problem of identifying the microblogs that could be targeted for advertisements as a two-step classification task. In the first pass, microblogs suitable for advertising are identified. Next, in the second pass, we build a model to find the sentiment of the advertisable microblog. The system uses features derived from part-of-speech tags and the tweet content, along with external resources such as query logs and n-gram dictionaries from previously labeled data. This work aims at providing a thorough insight into the problem and analyzing various features to assess which features contribute the most towards identifying the tweets that can be targeted for advertisements.


Finding Influential Authors in Brand-Page Communities

AAAI Conferences

Enterprises are increasingly using social media forums to engage with their customers online, a phenomenon known as Social Customer Relation Management (Social CRM). In this context, it is important for an enterprise to identify "influential authors" and engage with them on a priority basis. We present a study towards finding influential authors on Twitter forums where an implicit network based on user interactions is created and analyzed. Furthermore, author profile features and user interaction features are combined in a decision tree classification model for finding influential authors. A novel objective evaluation criterion is used for evaluating various features and modeling techniques. We compare our methods with other approaches that use either only the formal connections or only the author profile features and show a significant improvement in the classification accuracy over these baselines as well as over using Klout score.


OurCity: Understanding How Visualization and Aggregation of User-Generated Content Can Engage Citizens in Community Participation

AAAI Conferences

OurCity is a site-specific digital artwork designed to solicit, aggregate and visualize citizens' views on the cities in which they live. It aims to allow people to have their voice heard in a way which is fun and engaging and reduces the gap between citizens and policymakers. OurCity builds on our previous work, VoiceYourView (Whittle et al. 2010), which used similar data aggregation techniques but a completely different visualization of user-generated data. This paper revisits the key results from VoiceYourView and hence uses OurCity as an additional validation exercise to assess whether VoiceYourView results are generalizable.


FoodMood: Measuring Global Food Sentiment One Tweet at a Time

AAAI Conferences

Do Happy Meals really make us happy? Do salads make us blue? Is cake our comfort? FoodMood is an interactive data visualisation project that gives citizens a rare opportunity to engage and reflect, acknowledge, and understand the connection between emotion, obesity and food. The project explores the opportunities presented by the data-sharing world of today's cities using global English-language tweets about food coupled with sentiment analysis. It aims to gain a better understanding of global food consumption patterns and their impact on the daily emotional well-being of people against the backdrop of country data such as Gross Domestic Product (GDP) and obesity levels. A key finding is that tweets can be used to find a relationship between certain foods, food sentiment and obesity levels in countries. Overall, FoodMood shows a majority positive sentiment towards food. Other findings, although constantly evolving, indicate trends such as: globally, meat enjoys a high sentiment rating and is often tweeted about; fast-food companies dominate the food consumption landscapes of most countries' tweets, although not all of them enjoy equal sentiment ratings across countries. Ultimately, FoodMood reveals a hidden layer of meaningful digital, social, and cultural data that provide a basis for further analysis.


Automatic Versus Human Navigation in Information Networks

AAAI Conferences

People regularly face tasks that can be understood as navigation in information networks, where the goal is to find a path between two given nodes. In many such situations, the navigator only gets local access to the node currently under inspection and its immediate neighbors. This lack of global information about the network notwithstanding, humans tend to be good at finding short paths, despite the fact that real-world networks are typically very large. One potential reason for this could be that humans possess vast amounts of background knowledge about the world, which they leverage to make good guesses about possible solutions. In this paper we ask the question: Are human-like high-level reasoning skills really necessary for finding short paths? To answer this question, we design a number of navigation agents without such skills, which use only simple numerical features. We evaluate the agents on the task of navigating Wikipedia, a domain for which we also possess large-scale human navigation data. We observe that the agents find shorter paths than humans on average and therefore conclude that, perhaps surprisingly, no sophisticated background knowledge or high-level reasoning is required for navigating the complex Wikipedia network.


Towards Analyzing Micro-Blogs for Detection and Classification of Real-Time Intentions

AAAI Conferences

Micro-blog forums, such as Twitter, constitute a powerful medium today that people use to express their thoughts and intentions on a daily, and in many cases, hourly, basis. Extracting the 'Real-Time Intention' (RTI) of a user from such short text updates is a huge opportunity towards web personalization and social networking around dynamic user context. In this paper, we explore the novel problem of detecting and classifying RTIs from micro-blogs. We find that employing a heuristic-based ensemble approach on a reduced dimension of the feature space, based on a wide spectrum of linguistic and statistical features of RTI expressions, achieves significant improvement in detecting RTIs compared to word-level features used in many social media classification tasks today. Our solution approach takes into account various salient characteristics of micro-blogs towards such classification: high dimensionality, sparseness of data, limited context, grammatical incorrectness, etc.


Modeling Destructive Group Dynamics in On-Line Gaming Communities

AAAI Conferences

Social groups often exhibit a high degree of dynamism. Some groups thrive, while many others die over time. Modeling destructive dynamics and understanding whether/why/when a person will depart from a group can be important in a number of social domains. In this paper, we take the World of Warcraft game as an exemplar platform for studying destructive group dynamics. We build models to predict if and when an individual is going to quit his/her guild, and whether this quitting event will inflict substantial damage on the guild. Our predictors start from in-game census data and extract features from multiple perspectives such as individual-level, guild-level, game activity, and social interaction features. Our study shows that destructive group dynamics can often be predicted with modest to high accuracy, and feature diversity is critical to prediction performance.


Weblog Analysis for Predicting Correlations in Stock Price Evolutions

AAAI Conferences

We use data extracted from many weblogs to identify the underlying relations of a set of companies in the Standard and Poor's (S&P) 500 index. We define a pairwise similarity measure for the companies based on the weblog articles and then apply a graph clustering procedure. We show that it is possible to capture some interesting relations between companies using this method. As an application of this clustering procedure we propose a cluster-based portfolio selection method which combines information from the weblog data and historical stock prices. Through simulation experiments, we show that our method performs better (in terms of risk measures) than cluster-based portfolio strategies based on company sectors or historical stock prices. This suggests that the methodology has the potential to identify groups of companies whose stock prices are more likely to be correlated in the future.