Goto

Collaborating Authors

 Europe


Enhancing Event Descriptions through Twitter Mining

AAAI Conferences

We describe a simple IR approach for linking news about events, detected by an event extraction system, to messages from Twitter (tweets). In particular, we explore several methods for creating event-specific queries for Twitter and provide a quantitative and qualitative evaluation of the relevance and usefulness of the information obtained from the tweets. We showed that methods based on utilization of word co-occurrence clustering, domain-specific keywords and named entity recognition improve the performance with respect to a basic approach.


Filtering Noisy Web Data by Identifying and Leveraging Users' Contributions

AAAI Conferences

In this paper we present several methods for collecting Web textual contents and filtering noisy data. We show that knowing which user publishes which contents can contribute to detecting noise. We begin by collecting data from two forums and from Twitter. For the forums, we extract the meaningful information from each discussion (texts of question and answers, IDs of users, date). For the Twitter dataset, we first detect tweets with very similar texts, which helps avoiding redundancy in further analysis. Also, this leads us to clusters of tweets that can be used in the same way as the forum discussions: they can be modeled by bipartite graphs. The analysis of nodes of the resulting graphs shows that network structure and content type (noisy or relevant) are not independent, so network studying can help in filtering noise.


Social Media Is NOT that Bad! The Lexical Quality of Social Media

AAAI Conferences

There is a strong correlation between spelling errors and web text content quality. Using our lexical quality measure,  based in a small corpus of spelling errors, we present an estimation of the lexical quality of the main Social Media sites. This paper presents an updated and complete analysis of the lexical quality of Social Media written in English and Spanish, including how lexical quality changes in time.


Talk of the City: Our Tweets, Our Community Happiness

AAAI Conferences

The literature of urban sociology and that of psychology have separately established two relationships: the first has linked characteristics of a community to its residents’ well-being, the second has linked well-being of individuals to their use of words. No one has hitherto explored the potential transitive relationship - that between characteristics of a community and its residents' use of words. We test this relationship by performing three steps. We consider Twitter users in a variety of London census communities; extract the subject matter of their tweets using "topic models"; and study the relationship between topics and community socio-economic well-being. We find that certain topics are correlated (positively and negatively) with community deprivation. Users in more deprived community tweet about wedding parties, matters expressed in Spanish/Portuguese, and celebrity gossips. By contrast, those in less deprived communities tweet about vacations, professional use of social media, environmental issues, sports, and health issues. We finally show that monitoring the subject matter of tweets not only offers insights into community well-being, but it is also a reasonable way of predicting community deprivation scores.


Finding Influential Authors in Brand-Page Communities

AAAI Conferences

Enterprises are increasingly using social media forums to engage with their customer online- a phenomenon known as Social Customer Relation Management (Social CRM) . In this context, it is important for an enterprise to identify “influential authors” and engage with them on a priority basis. We present a study towards finding influential authors on Twitter forums where an implicit network based on user interactions is created and analyzed. Furthermore, author profile features and user interaction features are combined in a decision tree classification model for finding influential authors. A novel objective evaluation criterion is used for evaluating various features and modeling techniques. We compare our methods with other approaches that use either only the formal connections or only the author profile features and show a significant improvement in the classification accuracy over these baselines as well as over using Klout score.


Emotional Divergence Influences Information Spreading in Twitter

AAAI Conferences

We analyze data about the micro-blogging site Twitter using sentiment extraction techniques. From an information perspective, Twitter users are involved mostly in two processes: information creation and subsequent distribution (tweeting), and pure information distribution (retweeting), with pronounced preference to the first. However a rather substantial fraction of tweets are retweeted. Here, we address the role of the sentiment expressed in tweets for their potential aftermath. We find that although the overall sentiment (polarity) does not influence the probability of a tweet to be retweeted, a new measure called "emotional divergence" does have an impact. In general, tweets with high emotional diversity have a better chance of being retweeted, hence influencing the distribution of information.


A Sentiment-Aware Approach to Community Formation in Social Media

AAAI Conferences

Participating in a community exemplifies the aspect of sharing, networking and interacting in a social media system. There has been extensive work on characterising on-line communities by their contents and tags using topic modelling tools. However, the role of sentiment and mood has not been studied. Arguably, mood is an integral feature of a text, and becomes more significant in the context of social media: two communities might discuss precisely the same topics, yet within an entirely different atmosphere. Such sentiment-related distinctions are important for many kinds of analysis and applications, such as community recommendation. We present a novel approach to identification of latent hyper-groups in social communities based on users’ sentiment. The results show that a sentiment-based approach can yield useful insights into community formation and meta-communities, having potential applications in, for example, mental health—by targeting support or surveillance to communities with negative mood—or in marketing—by targeting customer communities having the same sentiment on similar topics.


Evolutionary Clustering and Analysis of User Behaviour in Online Forums

AAAI Conferences

In this paper we cluster and analyse temporal user behaviour in online communities. We adapt a simple unsupervised clustering algorithm to an evolutionary setting where we cluster users into prototypical behavioural roles based on features derived from their ego-centric reply-graphs. We then analyse changes in the role membership of the users over time, the change in role composition of forums over time and examine the differences between forums in terms of role composition. We perform this analysis on 200 forums from a popular national bulletin board and 14 enterprise technical support forums.


More of a Receiver Than a Giver: Why Do People Unfollow in Twitter?

AAAI Conferences

We propose a logistic regression model taking into account two analytically different sets of factors–structure and action. The factors include individual, dyadic, and triadic properties between ego and alter whose tie breakup is under consideration. From the fitted model using a large-scale data, we discover 5 structural and 7 actional variables to have significant explanatory power for unfollow. One unique finding from our quantitative analysis is that people appreciate receiving acknowledgements from others even in virtually unilateral communication relationships and are less likely to unfollow them: people are more of a receiver than a giver.


Tweetin' in the Rain: Exploring Societal-Scale Effects of Weather on Mood

AAAI Conferences

There has been significant recent interest in using the aggregate sentiment from social media sites to understand and predict real-world phenomena. However, the data from social media sites also offers a unique and — so far — unexplored opportunity to study the impact of external factors on aggregate sentiment, at the scale of a society. Using a Twitter-specific sentiment extraction methodology, we the explore patterns of sentiment present in a corpus of over 1.5 billion tweets. We focus primarily on the effect of the weather and time on aggregate sentiment, evaluating how clearly the well-known individual patterns translate into population-wide patterns. Using machine learning techniques on the Twitter corpus correlated with the weather at the time and location of the tweets, we find that aggregate sentiment follows distinct climate, temporal, and seasonal patterns.