FoodMood: Measuring Global Food Sentiment One Tweet at a Time

AAAI Conferences

Do Happy Meals really make us happy? Do salads make us blue? Is cake our comfort? FoodMood is an interactive data visualisation project that gives citizens a rare opportunity to engage and reflect, acknowledge, and understand the connection between emotion, obesity and food. The project explores the opportunities presented by the data-sharing world of today's cities using global English-language tweets about food coupled with sentiment analysis. It aims to gain a better understanding of global food consumption patterns and their impact on the daily emotional well-being of people against the backdrop of country data such as Gross Domestic Product (GDP) and obesity levels. A key finding is that tweets can be used to find a relationship between certain foods, food sentiment and obesity levels in countries. Overall, FoodMood shows a majority positive sentiment towards food. Other findings, although constantly evolving, indicate trends such as: globally, meat enjoys a high sentiment rating and is often tweeted about; fast-food companies dominate the food consumption landscapes of most countries' tweets, although not all of them enjoy equal sentiment ratings across countries. Ultimately, FoodMood reveals a hidden layer of meaningful digital, social, and cultural data that provide a basis for further analysis.


Trendminer: An Architecture for Real Time Analysis of Social Media Text

AAAI Conferences

The emergence of online social networks (OSNs) and the accompanying availability of large amounts of data pose a number of new natural language processing (NLP) and computational challenges. Data from OSNs is different from data from traditional sources (e.g. newswire). The texts are short, noisy and conversational. Another important issue is that the data arrives in real-time streams, needing immediate analysis that is grounded in time and context. In this paper we describe a new open-source framework for efficient text processing of streaming OSN data (available at www.trendminer-project.eu). Whilst researchers have made progress in adapting or creating text analysis tools for OSN data, a system to unify these tasks has yet to be built. Our system is focused on a real-world scenario where fast processing and accuracy are paramount. We use the MapReduce framework for distributed computing and present running times for our system in order to show that scaling to online scenarios is feasible. We describe the components of the system and evaluate their accuracy. Our system supports easy integration of future modules in order to extend its functionality.


Unsupervised Real-Time Company Name Disambiguation in Twitter

AAAI Conferences

This paper presents a new approach to disambiguating company names in the Twitter social network. We have focused on lightening the process of comparing company profiles with tweets in order to obtain a competitive real-time system. With this aim, we use only the home page of each company as the information source to create a unique profile. We then compute the similarity of a tweet to a profile by comparing the content of the tweet with the profile. Neither step uses any other external information source, and the whole process is unsupervised. We have tested our application on the WePS-3 CLEF ORM test corpus, obtaining encouraging results.
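The tweet-to-profile comparison described above can be illustrated with a minimal sketch. The abstract does not say which similarity measure the authors use, so cosine similarity over raw term counts is an assumption here, and the function names (`tokens`, `cosine`) and the example texts are made up for illustration.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lower-case the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical profile text standing in for the content of a company home page.
profile = tokens("Apple designs the iPhone, iPad and Mac computers")

# One tweet that plausibly refers to the company and one that does not.
related = cosine(tokens("new iphone and mac from apple"), profile)
unrelated = cosine(tokens("baked an apple pie today"), profile)
```

In this toy setting the first tweet scores higher than the second, so a threshold on the score could separate company mentions from other uses of the name. How such a threshold would be chosen is not covered by the abstract.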


Using Complex Event Processing for Modeling Semantic Requests in Real-Time Social Media Monitoring

AAAI Conferences

Social media analytics has been attracting considerable attention in both research and industry due to the increasing popularity of social media usage. As a subset, social media monitoring describes the process of continuously monitoring a subject matter in social media. From our point of view, the key requirements for such systems are i) high throughput and real-time processing of incoming data, ii) a user-friendly way to define complex situations of interest that make use of formalized background knowledge and iii) capabilities to perform actions based on gained insights, rather than pure monitoring. In this paper, we propose a system for (pro)active, real-time social media monitoring. Firstly, we describe the conceptual architecture of our system and the necessary pre-processing steps. Secondly, we introduce our concept of semantic requests, which extend event pattern definitions with background knowledge. Finally, we show the usefulness of this system in two different domains: real-time political opinion tracking and proactive establishment of relationships with consumers in order to perform a new form of real-time marketing. The main advantage of our approach is a simplified, expressive way to formulate event patterns in social media applications.


A Systematic Investigation of Blocking Strategies for Real-Time Classification of Social Media Content into Events

AAAI Conferences

Events play a prominent role in our lives, such that many social media documents describe or are related to some event. Organizing social media documents with respect to events thus seems a promising approach to better manage and organize the ever-increasing amount of user-generated content in social media applications. It would support the navigation of data by events or allow one to get notified about new postings related to the events one is interested in, just to name two applications. A challenge is to automate this process so that incoming documents can be assigned to their corresponding event without any user intervention. We present a system that is able to classify a stream of social media data into a growing and evolving set of events. In order to scale up to the data sizes and data rates in social media applications, the use of a candidate retrieval or blocking step is crucial to reduce the number of events that are considered as potential candidates to which the incoming data point could belong. In this paper we present and experimentally compare different blocking strategies along their cost vs. effectiveness tradeoff. We show that using a blocking strategy that selects the 60 closest events with respect to upload time, we reach F-Measures of about 85.1% while being able to process the incoming documents within 32ms on average. We thus provide a principled approach to scaling up the classification of social media documents into events and to processing the incoming stream of documents in real time.
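The upload-time blocking step mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the `Event` class, its single representative `upload_time`, and the function `block_by_upload_time` are hypothetical, and the abstract does not describe how the authors represent an event's time or index the events.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Event:
    event_id: str
    upload_time: float  # assumed: one representative timestamp per event, in seconds


def block_by_upload_time(doc_time, events, k=60):
    """Return the k events whose timestamps are nearest to the document's upload time."""
    return heapq.nsmallest(k, events, key=lambda e: abs(e.upload_time - doc_time))


# Toy data: 1000 events spaced 100 seconds apart.
events = [Event(f"e{i}", float(i * 100)) for i in range(1000)]

# Only these candidates would be passed on to the (more expensive) classifier.
candidates = block_by_upload_time(doc_time=5030.0, events=events, k=60)
```

The point of the step is that the classifier compares an incoming document against 60 candidates instead of every known event. A linear scan with `heapq.nsmallest` is used here for brevity; a sorted index over timestamps would be a natural alternative for large event sets.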


SMILE: An Informality Classification Tool for Helping to Assess Quality and Credibility in Web 2.0 Texts

AAAI Conferences

The data made available by Web 2.0 applications such as social networks, on-line chats or blogs have given access to multiple sources of information. Due to this dramatic increase in available information, the perception of quality and credibility plays an important role in social media, making it necessary to discard low-quality and uninteresting content. Moreover, the informal features of Web 2.0 texts, such as emoticons, typos, slang or loss of formatting, negatively affect user perception of content quality and credibility. For this reason, this paper proposes the SMILE system, a novel unsupervised real-time tool for assessing user-generated content quality and credibility using informality levels. As a test case, we focus on Yahoo! Answers, a relevant Web 2.0 application in terms of its number of users, content and textual diversity. The results of our study show that informality analysis can be used as a criterion to help assess the credibility and quality of Web 2.0 information sources.


Frankenplace: An Application for Similarity-Based Place Search

AAAI Conferences

When experiencing or describing a new place people will often compare it against other places that they already know. However, this human attention to the simultaneous similarities and differences between places is not reflected in the design of user interfaces of current place search technologies. In this demo, we present Frankenplace, an application for doing similarity-based place search that allows users to interactively find new places based on mixtures of features drawn from different places. The features of places are derived from a combination of authoritative data sources and unstructured observation data from social media, and organized into an extensible set of layers. We demonstrate the Frankenplace interface, which lets a user build a profile of a target place by selecting the most relevant of the properties shared by known places.


A Supervised Approach to Predict Company Acquisition with Factual and Topic Features Using Profiles and News Articles on TechCrunch

AAAI Conferences

Merger and Acquisition (M&A) prediction has been an interesting and challenging research topic in the past few decades. However, past work has only adopted numerical features in building models, and the valuable textual information from the great variety of social media sites has not been touched at all. To fully explore this information, we used the profiles and news articles for companies and people on TechCrunch, the leading and largest public database for the tech world, which anybody can edit. Specifically, we explored topic features via topic modeling techniques, as well as a set of other novel features of our design within a machine learning framework. We conducted experiments of the largest scale in the literature, and achieved a high true positive rate (TP) between 60% and 79.8% with a false positive rate (FP) mostly between 0% and 8.3% over company categories with a small number of missing attributes in the CrunchBase profiles.


Mixed Membership Models for Exploring User Roles in Online Fora

AAAI Conferences

Discussion boards are a form of social media which allow users to discuss topics and exchange information in a complex manner, in a number of different settings. As the popularity of such message boards has increased, communities of users have emerged, and several prominent types of social role have been identified, such as Question Answerer, Celebrity, Discussion Person and Topic Initiator. Recent studies have noted the structural similarity of the egocentric networks of users assigned the same role by qualitative criteria. In this paper a methodology is developed with which to cluster together users with similar egocentric network structures. This is achieved using a mixed membership formulation which allows for the fact that different groups of users may have characteristics in common. The method is then applied to data taken from boards.ie, a medium-sized message board website. Prominent clusters of users are identified and discussed, and illustrative examples of user behaviour provided. The type of interaction, both locally and globally, taking place within forums is examined.


What Catches Your Attention? An Empirical Study of Attention Patterns in Community Forums

AAAI Conferences

Online community managers work towards building and managing communities around a given brand or topic. A risk faced by such managers is that their community may die out and its utility to users diminish. Understanding what drives attention to content and the dynamics of discussions in a given community informs the community manager and/or host about the factors that are associated with attention. In this paper we gain insights into the idiosyncrasies that individual community forums exhibit in their attention patterns and how the factors that impact activity differ. We glean such insights by using logistic regression models for identifying seed posts and explore the effectiveness of a range of features. Our findings show that the discussion behaviour of different communities is clearly impacted by different factors.