Universitat Pompeu Fabra
Online Petitioning Through Data Exploration and What We Found There: A Dataset of Petitions from Avaaz.org
Aragón, Pablo (Universitat Pompeu Fabra & Eurecat) | Sáez-Trumper, Diego (Universitat Pompeu Fabra) | Redi, Miriam (Wikimedia Foundation) | Hale, Scott (University of Oxford) | Gómez, Vicenç (Universitat Pompeu Fabra) | Kaltenbrunner, Andreas (Universitat Pompeu Fabra)
The Internet has become a fundamental resource for activism as it facilitates political mobilization at a global scale. Petition platforms are a clear example of how thousands of people around the world can contribute to social change. Avaaz.org, with a presence in over 200 countries, is one of the most popular of this type. However, little research has focused on this platform, probably due to a lack of available data. In this work we retrieved more than 350K petitions, standardized their field values, and added new information using language detection and named-entity recognition. To motivate future research with this unique repository of global protest, we present a first exploration of the dataset. In particular, we examine how social media campaigning is related to the success of petitions, as well as some geographic and linguistic findings about the worldwide community of Avaaz.org. We conclude with example research questions that could be addressed with our dataset.
Planning With Pixels in (Almost) Real Time
Bandres, Wilmer (Universitat Pompeu Fabra) | Bonet, Blai (Universidad Simón Bolívar) | Geffner, Hector (ICREA & Universitat Pompeu Fabra)
Recently, width-based planning methods have been shown to yield state-of-the-art results in the Atari 2600 video games. For this, the states were associated with the (RAM) memory states of the simulator. In this work, we consider the same planning problem but using the screen instead. By using the same visual inputs, the planning results can be compared with those of humans and learning methods. We show that the planning approach, out of the box and without training, results in scores that compare well with those obtained by humans and learning methods, and moreover, by developing an episodic, rollout version of the IW(k) algorithm, we show that such scores can be obtained in almost real time.
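As a rough illustration of the width-based idea behind IW(k), here is a minimal sketch of plain IW(1): a breadth-first search that prunes every generated state that does not make some atom true for the first time. It is not the episodic rollout variant described in the abstract, and it does not use screen features; the function names (`iw1`, `successors`, `atoms`, `is_goal`) and the toy counter domain are assumptions made for this example.

```python
from collections import deque


def iw1(initial, successors, atoms, is_goal):
    """Breadth-first search that prunes states adding no unseen atom (IW(1))."""
    seen_atoms = set(atoms(initial))
    queue = deque([(initial, [])])
    while queue:
        state, plan = queue.popleft()
        if is_goal(state):
            return plan
        for action, nxt in successors(state):
            new_atoms = set(atoms(nxt)) - seen_atoms
            if not new_atoms:
                continue  # no atom is new, so the state is pruned
            seen_atoms |= new_atoms
            queue.append((nxt, plan + [action]))
    return None  # goal not reachable within width 1


# Hypothetical toy domain: a counter that can be incremented or left alone.
def toy_successors(state):
    return [("inc", state + 1), ("noop", state)]


def toy_atoms(state):
    return {("x", state)}


plan = iw1(0, toy_successors, toy_atoms, lambda s: s == 3)
```

In this toy domain the `noop` successors are always pruned because they repeat an atom already seen, so the search only expands the chain of increments.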
To Thread or Not to Thread: The Impact of Conversation Threading on Online Discussion
Aragón, Pablo (Universitat Pompeu Fabra) | Gómez, Vicenç (Universitat Pompeu Fabra) | Kaltenbrunner, Andreas (Universitat Pompeu Fabra)
Online discussion is essential for the communication and collaboration of online communities. The reciprocal exchange of messages between users that characterizes online discussion can be represented in many different ways. While some platforms display messages chronologically using a simple linear interface, others use a hierarchical (threaded) interface to represent the structure of the discussion more explicitly. Although the type of representation has been shown to affect communication, to the best of our knowledge, the impact of using one or the other has not yet been investigated in a large and mature online community. In this work we analyze Meneame, a popular Spanish social news platform which recently transitioned from a linear to a hierarchical interface, providing an ideal research opportunity for this purpose. Using interrupted time series analysis and regression discontinuity design, we observe an abrupt and significant increase in social reciprocity after the adoption of a threaded interface. We furthermore extend state-of-the-art generative models of discussion threads by including reciprocity, a fundamental feature that better explains the structure of the discussions, both before and after the change in the interface.
Hierarchical Linearly-Solvable Markov Decision Problems
Jonsson, Anders (Universitat Pompeu Fabra) | Gómez, Vicenç (Universitat Pompeu Fabra)
We present a hierarchical reinforcement learning framework that formulates each task in the hierarchy as a special type of Markov decision process for which the Bellman equation is linear and has an analytical solution. Problems of this type, called linearly-solvable MDPs (LMDPs), have interesting properties that can be exploited in a hierarchical setting, such as efficient learning of the optimal value function or task compositionality. The proposed hierarchical approach can also be seen as a novel alternative to solving LMDPs with large state spaces. We derive a hierarchical version of the so-called Z-learning algorithm that learns different tasks simultaneously and show empirically that it significantly outperforms the state-of-the-art learning methods in two classical HRL domains: the taxi domain and an autonomous guided vehicle task.
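To make the linear Bellman equation concrete, the following is a minimal, flat (non-hierarchical) sketch. It assumes the standard first-exit LMDP formulation, in which the desirability z(x) = exp(-v(x)) satisfies z(x) = exp(-q(x)) * sum over x' of p(x'|x) z(x'), with q the state cost and p the passive dynamics. The function name `solve_lmdp` and the 5-state chain are assumptions made for this example; this is neither the hierarchical method nor Z-learning itself, which estimates z from samples instead of from a known model.

```python
import math


def solve_lmdp(passive, cost, terminal, sweeps=2000):
    """Iterate z(x) = exp(-q(x)) * sum_x' p(x'|x) z(x') to a fixed point."""
    n = len(cost)
    z = [1.0] * n
    for _ in range(sweeps):
        new_z = []
        for x in range(n):
            if x in terminal:
                # Terminal states only pay their own cost.
                new_z.append(math.exp(-cost[x]))
            else:
                expected = sum(p * z[y] for y, p in passive[x].items())
                new_z.append(math.exp(-cost[x]) * expected)
        z = new_z
    return z


# Hypothetical 5-state chain: unbiased random walk, state 4 is the exit.
passive = {
    0: {0: 0.5, 1: 0.5},
    1: {0: 0.5, 2: 0.5},
    2: {1: 0.5, 3: 0.5},
    3: {2: 0.5, 4: 0.5},
    4: {4: 1.0},
}
cost = [1.0, 1.0, 1.0, 1.0, 0.0]

z = solve_lmdp(passive, cost, terminal={4})
value = [-math.log(zx) for zx in z]  # v(x) = -log z(x)
```

Because the update is linear in z, the same fixed point could also be obtained by solving a linear system; the iteration is used here only to keep the sketch dependency-free.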
Real-Time Stochastic Optimal Control for Multi-Agent Quadrotor Systems
Gómez, Vicenç (Universitat Pompeu Fabra) | Thijssen, Sep (Radboud University) | Symington, Andrew (University of California Los Angeles) | Hailes, Stephen (University College London) | Kappen, Hilbert J (Radboud University Nijmegen)
This paper presents a novel method for controlling teams of unmanned aerial vehicles using Stochastic Optimal Control (SOC) theory. The approach consists of a centralized high-level planner that computes optimal state trajectories as velocity sequences, and a platform-specific low-level controller which ensures that these velocity sequences are met. The planning task is expressed as a centralized path-integral control problem, for which optimal control computation corresponds to a probabilistic inference problem that can be solved by efficient sampling methods. Through simulation we show that our SOC approach (a) has significant benefits compared to deterministic control and other SOC methods in multimodal problems with noise-dependent optimal solutions, (b) is capable of controlling a large number of platforms in real-time, and (c) yields collective emergent behaviour in the form of flight formations. Finally, we show that our approach works for real platforms, by controlling a team of three quadrotors in outdoor conditions.
ExTaSem! Extending, Taxonomizing and Semantifying Domain Terminologies
Espinosa-Anke, Luis (Universitat Pompeu Fabra) | Saggion, Horacio (Universitat Pompeu Fabra) | Ronzano, Francesco (Universitat Pompeu Fabra) | Navigli, Roberto (Sapienza University of Rome)
We introduce ExTaSem!, a novel approach for the automatic learning of lexical taxonomies from domain terminologies. First, we exploit a very large semantic network to collect thousands of in-domain textual definitions. Second, we extract (hyponym, hypernym) pairs from each definition with a CRF-based algorithm trained on manually-validated data. Finally, we introduce a graph induction procedure which constructs a full-fledged taxonomy where each edge is weighted according to its domain pertinence. ExTaSem! achieves state-of-the-art results in the following taxonomy evaluation experiments: (1) Hypernym discovery, (2) Reconstructing gold standard taxonomies, and (3) Taxonomy quality according to structural measures. We release weighted taxonomies for six domains for the use and scrutiny of the community.
Do We Criticise (and Laugh) in the Same Way? Automatic Detection of Multi-Lingual Satirical News in Twitter
Barbieri, Francesco (Universitat Pompeu Fabra) | Ronzano, Francesco (Universitat Pompeu Fabra) | Saggion, Horacio (Universitat Pompeu Fabra)
During the last few years, the investigation of methodologies to automatically detect and characterise the figurative traits of textual contents has attracted growing interest. Indeed, the capability to correctly deal with figurative language, and more specifically with satire, is fundamental to building robust approaches in several sub-fields of Artificial Intelligence, including Sentiment Analysis and Affective Computing. In this paper we investigate the automatic detection of Tweets that advertise satirical news in English, Spanish and Italian. For this purpose we present a system that models Tweets from different languages by a set of language-independent features that describe lexical, semantic and usage-related properties of the words of each Tweet. We approach the satire identification problem as binary classification of Tweets as satirical or not satirical messages. We test the performance of our system with both monolingual and cross-language classification experiments, evaluating the satire detection effectiveness of our features. Our system outperforms a word-based baseline and is able to recognise whether a news item on Twitter is satirical or not with good accuracy. Moreover, we analyse the behaviour of the system across the different languages, obtaining interesting results.
Robust Winners and Winner Determination Policies under Candidate Uncertainty
Boutilier, Craig (University of Toronto) | Lang, Jérôme (Université Paris-Dauphine) | Oren, Joel (University of Toronto) | Palacios, Héctor (Universitat Pompeu Fabra)
We consider voting situations in which some candidates may turn out to be unavailable. When determining availability is costly (e.g., in terms of money, time, or computation), voting prior to determining candidate availability and testing the winner's availability after the vote may be beneficial. However, since few voting rules are robust to candidate deletion, winner determination requires a number of such availability tests. We outline a model for analyzing such problems, defining robust winners relative to potential candidate unavailability. We assess the complexity of computing robust winners for several voting rules. Assuming a distribution over availability, and costs for availability tests/queries, we describe algorithms for computing optimal query policies, which minimize the expected cost of determining true winners.
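The sensitivity of a voting rule to candidate deletion can be illustrated with a small brute-force sketch. It assumes plurality voting over complete rankings with lexicographic tie-breaking, and one plausible reading of robustness: a candidate is robust if it wins for every availability set that contains it. The function names and the example profile are assumptions made for this example, not the paper's definitions or algorithms, and the enumeration is exponential in the number of candidates.

```python
from itertools import combinations


def plurality_winner(rankings, available):
    """Each voter supports their top-ranked available candidate."""
    scores = {c: 0 for c in available}
    for ranking in rankings:
        top = next(c for c in ranking if c in available)
        scores[top] += 1
    # Highest score wins; ties are broken lexicographically (an assumption).
    return min(available, key=lambda c: (-scores[c], c))


def is_robust_winner(candidate, rankings, candidates):
    """Check every availability set that contains the candidate."""
    others = [c for c in candidates if c != candidate]
    for k in range(len(others) + 1):
        for subset in combinations(others, k):
            available = {candidate, *subset}
            if plurality_winner(rankings, available) != candidate:
                return False
    return True


# Hypothetical profile: 3 voters a>b>c, 2 voters b>c>a, 2 voters c>b>a.
rankings = (
    [["a", "b", "c"]] * 3
    + [["b", "c", "a"]] * 2
    + [["c", "b", "a"]] * 2
)
candidates = ["a", "b", "c"]

full_winner = plurality_winner(rankings, {"a", "b", "c"})
winner_without_c = plurality_winner(rankings, {"a", "b"})
```

In this profile the plurality winner changes once one candidate becomes unavailable, so the original winner is not robust under the reading assumed above.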
Safe, Strong, and Tractable Relevance Analysis for Planning
Haslum, Patrik (Australian National University) | Helmert, Malte (University of Basel) | Jonsson, Anders (Universitat Pompeu Fabra)
In large and complex planning problems, there will almost inevitably be aspects that are not relevant to a specific problem instance. Thus, identifying and removing irrelevant parts from an instance is one of the most important techniques for scaling up automated planning. We examine the path-based relevance analysis method, which is safe (preserves plan existence and cost) and powerful but has exponential time complexity, and show how to make it run in polynomial time with only a minimal loss of pruning power.
Social Media Is NOT that Bad! The Lexical Quality of Social Media
Rello, Luz (Universitat Pompeu Fabra) | Baeza-Yates, Ricardo (Yahoo! Research)
There is a strong correlation between spelling errors and web text content quality. Using our lexical quality measure, based on a small corpus of spelling errors, we present an estimation of the lexical quality of the main Social Media sites. This paper presents an updated and complete analysis of the lexical quality of Social Media written in English and Spanish, including how lexical quality changes over time.