Government
Information Markets for Social Participation in Public Policy Design and Implementation
Mentzas, Gregoris (National Technical University of Athens) | Apostolou, Dimitris (University of Piraeus) | Bothos, Efthimios (National Technical University of Athens) | Magoutas, Babis (National Technical University of Athens)
In this paper we propose a research agenda on the use of information markets as tools to collect, aggregate and analyze citizens’ opinions, expectations and preferences from social media in order to support public policy design and implementation. We argue that markets are institutional settings able to efficiently allocate scarce resources, aggregate and disseminate information into prices and accommodate hedging against various types of risks. We discuss various types of information markets, as well as address the participation of both human and computational agents in such markets.
Towards Discovery of Influence and Personality Traits through Social Link Prediction
Nguyen, Thin (Curtin University of Technology) | Phung, Dinh (Curtin University of Technology) | Adams, Brett (Curtin University of Technology) | Venkatesh, Svetha (Curtin University of Technology)
Estimation of a person's influence and personality traits from social media data has many applications. We use social linkage criteria, such as number of followers and friends, as proxies to form corpora, from the popular blogging site LiveJournal, for examining two two-class classification problems: influential vs. non-influential, and extraversion vs. introversion. Classification is performed using automatically-derived psycholinguistic and mood-based features of a user's textual messages. We experiment with three sub-corpora of 10000 users each, and present the most effective predictors for each category. The best classification result, at 80%, is achieved using psycholinguistic features; e.g., influentials are found to use more complex language than non-influentials, and use more leisure-related terms.
Limits of Electoral Predictions Using Twitter
Gayo-Avello, Daniel (Universidad de Oviedo) | Metaxas, Panagiotis Takis (Wellesley College) | Mustafaraj, Eni (Wellesley College)
Using social media for political discourse is becoming common practice, especially around election time. One interesting aspect of this trend is the possibility of taking the pulse of public opinion about the elections, and that has attracted the interest of many researchers and the press. Allegedly, predicting electoral outcomes from social media data can be feasible and even simple. Positive results have been reported, but without an analysis of what principle enables them. Our work puts to test the purported predictive power of social media metrics against the 2010 US congressional elections. Here, we applied techniques that had reportedly led to positive election predictions in the past to the Twitter data collected from the 2010 US congressional elections. Unfortunately, we find no correlation between the analysis results and the electoral outcomes, contradicting previous reports. Observing that 80 years of polling research would support our findings, we argue that one should not accept predictions about events using social media data as a black box. Instead, scholarly research should be accompanied by a model explaining the predictive power of social media, when there is one.
Analyzing Political Trends in the Blogosphere
Demartini, Gianluca (L3S Research Center) | Siersdorfer, Stefan (L3S Research Center) | Chelaru, Sergiu (L3S Research Center) | Nejdl, Wolfgang (L3S Research Center)
In recent years, the blogosphere has become a vital part of the web, covering a variety of different points of view and opinions on political and event-related topics such as immigration, election campaigns, or economic developments. Tracking public opinion is usually done by conducting surveys, resulting in significant costs both for interviewers and persons consulted. In this paper, we propose a method for extracting political trends in the blogosphere. To this end, we apply sentiment and time series analysis techniques in combination with aggregation methods on blog data to estimate the temporal development of opinions on politicians.
Prominence Ranking in Graphs with Community Structure
Adali, Sibel (Rensselaer Polytechnic Institute) | Lu, Xiaohui (Rensselaer Polytechnic Institute) | Magdon-Ismail, Malik (Rensselaer Polytechnic Institute) | Purnell, Jonathan (Rensselaer Polytechnic Institute)
We consider prominence ranking in graphs involving actors, their artifacts and the artifact groups. When multiple actors contributing to an artifact constitutes a social tie, associations between the artifacts can be used to infer prominence among actors. This is because prominent actors will tend to collaborate on prominent artifacts, and prominent artifacts will be associated with other prominent artifacts. Our testbed example is the DBLP co-authorship graph: multiple authors (the actors) collaborate to publish research papers (the artifacts); collaboration is the social tie. Papers have prominence themselves (e.g., quality and impact of the work) and the prominence of the venues is tied to the prominence of the papers in them. We use our methods to infer prominence based on the venue-based associations of papers, and compare our rankings with external citation-based measures of prominence. We compare with numerous other ranking algorithms, and show that the ranking performance gain from using the venues is statistically significant. What if there are no natural artifact groups like venues? We develop a new algorithm which uses discovered artifact groups. Our approach consists of two steps. First, we find artifact groups by linking artifacts with common contributors. Note that instead of finding communities of actors, we consider communities of artifacts. We then use these grouped artifacts in the prominence ranking algorithm. We consider different methods for obtaining the artifact groups, in particular a very efficient embedding-based algorithm for graph clustering, and show the effectiveness of our method in improving the ranking of actors. The inferred groups are as good as or better than the natural conference venues for DBLP.
Classifying the Political Leaning of News Articles and Users from User Votes
Zhou, Daniel Xiaodan (University of Michigan) | Resnick, Paul (University of Michigan) | Mei, Qiaozhu (University of Michigan)
Social news aggregator services generate readers’ subjective reactions to news opinion articles. Can we use those as a resource to classify articles as liberal or conservative, even without knowing the self-identified political leaning of most users? We applied three semi-supervised learning methods that propagate classifications of political news articles and users as conservative or liberal, based on the assumption that liberal users will vote for liberal articles more often, and similarly for conservative users and articles. Starting from a few labeled articles and users, the algorithms propagate political leaning labels to the entire graph. In cross-validation, the best algorithm achieved 99.6% accuracy on held-out users and 96.3% accuracy on held-out articles. Adding social data such as users’ friendship or text features such as cosine similarity did not improve accuracy. The propagation algorithms, using the subjective liking data from users, also performed better than an SVM-based text classifier, which achieved 92.0% accuracy on articles.
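The propagation idea in this abstract can be illustrated with a minimal sketch. This is a generic label-propagation routine on a bipartite user-article vote graph, not one of the three methods evaluated in the paper; the function name `propagate_leaning`, the score convention (-1.0 for liberal, +1.0 for conservative), the neighbor-averaging update rule, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_leaning(votes, seeds, iterations=20):
    """Spread seed labels over a bipartite user-article vote graph.

    votes: iterable of (user, article) pairs; each pair means the user
           voted for the article.
    seeds: dict mapping ("u", user) or ("a", article) to a fixed score,
           -1.0 for liberal and +1.0 for conservative (assumed convention).
    Returns a dict mapping every node in the graph to a score in [-1, 1].
    """
    # Build an undirected bipartite adjacency list; the "u"/"a" prefix
    # keeps user ids and article ids from colliding.
    neighbors = defaultdict(set)
    for user, article in votes:
        neighbors[("u", user)].add(("a", article))
        neighbors[("a", article)].add(("u", user))

    # Unlabeled nodes start neutral; seed nodes start at their label.
    scores = {node: seeds.get(node, 0.0) for node in neighbors}

    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seeds:
                # Seed labels are clamped and never overwritten.
                updated[node] = seeds[node]
            else:
                # An unlabeled node takes the mean score of its neighbors.
                updated[node] = sum(scores[n] for n in nbrs) / len(nbrs)
        scores = updated
    return scores


# Toy data (hypothetical): two disjoint voting clusters, one seed article each.
votes = [
    ("alice", "a1"), ("alice", "a2"), ("bob", "a2"),
    ("carol", "a3"), ("carol", "a4"), ("dave", "a4"),
]
seeds = {("a", "a1"): -1.0, ("a", "a3"): 1.0}
scores = propagate_leaning(votes, seeds)

# The sign of the final score gives the predicted leaning.
for node in sorted(scores):
    label = "liberal" if scores[node] < 0 else "conservative"
    print(node, round(scores[node], 3), label)
```

In this sketch, users and articles that are connected only through votes to a liberal seed drift toward negative scores, and likewise toward positive scores for a conservative seed, which mirrors the assumption stated in the abstract that users tend to vote for articles matching their own leaning.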
Natural Language Processing to the Rescue? Extracting "Situational Awareness" Tweets During Mass Emergency
Verma, Sudha (University of Colorado) | Vieweg, Sarah (University of Colorado) | Corvey, William J. (University of Colorado) | Palen, Leysia (University of Colorado) | Martin, James H. (University of Colorado) | Palmer, Martha (University of Colorado) | Schram, Aaron (University of Colorado) | Anderson, Kenneth M. (University of Colorado)
In times of mass emergency, vast amounts of data are generated via computer-mediated communication (CMC) that are difficult to manually cull and organize into a coherent picture. Yet valuable information is broadcast, and can provide useful insight into time- and safety-critical situations if captured and analyzed properly and rapidly. We describe an approach for automatically identifying messages communicated via Twitter that contribute to situational awareness, and explain why it is beneficial for those seeking information during mass emergencies. We collected Twitter messages from four different crisis events of varying nature and magnitude and built a classifier to automatically detect messages that may contribute to situational awareness, utilizing a combination of hand-annotated and automatically-extracted linguistic features. Our system was able to achieve over 80% accuracy on categorizing tweets that contribute to situational awareness. Additionally, we show that a classifier developed for a specific emergency event performs well on similar events. The results are promising, and have the potential to aid the general public in culling and analyzing information communicated during times of mass emergency.
Memes Online: Extracted, Subtracted, Injected, and Recollected
Simmons, Matthew P. (University of Michigan) | Adamic, Lada A. (University of Michigan) | Adar, Eytan (University of Michigan)
Social media is playing an increasingly vital role in information dissemination. But with dissemination being more distributed, content often makes multiple hops, and consequently has opportunity to change. In this paper we focus on content that should be changing the least, namely quoted text. We find changes to be frequent, with their likelihood depending on the authority of the copied source and the type of site that is copying. We uncover patterns in the rate of appearance of new variants, their length, and popularity, and develop a simple model that is able to capture them. These patterns are distinct from ones produced when all copies are made from the same source, suggesting that information is evolving as it is being processed collectively in online social media.
Detecting and Tracking Political Abuse in Social Media
Ratkiewicz, Jacob (Indiana University) | Conover, Michael D. (Indiana University) | Meiss, Mark (Indiana University) | Goncalves, Bruno (Indiana University) | Flammini, Alessandro (Indiana University) | Menczer, Filippo (Indiana University)
We study astroturf political campaigns on microblogging platforms: politically-motivated individuals and organizations that use multiple centrally-controlled accounts to create the appearance of widespread support for a candidate or opinion. We describe a machine learning framework that combines topological, content-based and crowdsourced features of information diffusion networks on Twitter to detect the early stages of viral spreading of political misinformation. We present promising preliminary results with better than 96% accuracy in the detection of astroturf content in the run-up to the 2010 U.S. midterm elections.
A Machine Learning Approach to Twitter User Classification
Pennacchiotti, Marco (Yahoo! Labs) | Popescu, Ana-Maria (Yahoo! Labs)
This paper addresses the task of user classification in social media, with an application to Twitter. We automatically infer the values of user attributes such as political orientation or ethnicity by leveraging observable information such as the user behavior, network structure and the linguistic content of the user’s Twitter feed. We employ a machine learning approach which relies on a comprehensive set of features derived from such user information. We report encouraging experimental results on 3 tasks with different characteristics: political affiliation detection, ethnicity identification and detecting affinity for a particular business. Finally, our analysis shows that rich linguistic features prove consistently valuable across the 3 tasks and show great promise for additional user classification needs.