Exploring Text Virality in Social Networks
Guerini, Marco (Fondazione Bruno Kessler - IRST) | Strapparava, Carlo (Fondazione Bruno Kessler - IRST) | Ozbal, Gozde (Fondazione Bruno Kessler - IRST)
This paper aims to shed some light on the concept of virality, especially in social networks, and to provide new insights on its structure. We argue that: (a) virality is a phenomenon strictly connected to the nature of the content being spread, rather than to the influencers who spread it; (b) virality is a phenomenon with many facets, i.e., this generic term covers several different effects of persuasive communication that only partially overlap. To support our claims, we provide initial experiments in a machine learning framework to show how various aspects of virality can be independently predicted according to content features.
Characterizing Social Relations Via NLP-Based Sentiment Analysis
Groh, Georg (TU Muenchen) | Hauffa, Jan (TU Muenchen)
We investigate and evaluate methods for the characterization of social relations from textual communication context, using e-mail as an example. Social relations are intrinsically characterized by the Cartesian product of weights on various axes (we employ valuation and intensity as examples). The prediction of these characteristics is performed by application of unsupervised learning algorithms on meta-data, communication statistics, and the results of deep linguistic analysis of the message body. Classification of sentiment polarity is chosen as the means of linguistic analysis. We find that prediction accuracy can be improved by introducing limited amounts of additional information.
Automatically Identifying Groups Based on Content and Collective Behavioral Patterns of Group Members
Gregory, Michelle (Pacific Northwest National Laboratory) | Engel, Dave W. (Pacific Northwest National Laboratory) | Bell, Eric (Pacific Northwest National Laboratory) | Piatt, Andy (Pacific Northwest National Laboratory) | Dowson, Scott (Pacific Northwest National Laboratory) | Cowell, Andrew (Pacific Northwest National Laboratory)
The explosion of popularity in social media, such as internet forums, weblogs (blogs), wikis, etc., in the past decade has created a new opportunity to measure public opinion, attitude, and social structures (Agichtein et al. 2008, Qualman 2010). A very common social structure investigated is online communities, or groups. There are a number [...] For example, on Live Journal, there are a number of categories, gaming, for example, that one can categorize themselves and their blogs. While a number of those that self select that category may interact, there is no explicit requirement to do so. If one is interested in marketing to a gaming crowd, for instance, knowing all persons interested in gaming would be useful, even if they do not interact directly with one another.
Limits of Electoral Predictions Using Twitter
Gayo-Avello, Daniel (Universidad de Oviedo) | Metaxas, Panagiotis Takis (Wellesley College) | Mustafaraj, Eni (Wellesley College)
Using social media for political discourse is becoming common practice, especially around election time. One interesting aspect of this trend is the possibility of pulsing the public's opinion about the elections, and that has attracted the interest of many researchers and the press. Allegedly, predicting electoral outcomes from social media data can be feasible and even simple. Positive results have been reported, but without an analysis on what principle enables them. Our work puts to test the purported predictive power of social media metrics against the 2010 US congressional elections. Here, we applied techniques that had reportedly led to positive election predictions in the past, on the Twitter data collected from the 2010 US congressional elections. Unfortunately, we find no correlation between the analysis results and the electoral outcomes, contradicting previous reports. Observing that 80 years of polling research would support our findings, we argue that one should not be accepting predictions about events using social media data as a black box. Instead, scholarly research should be accompanied by a model explaining the predictive power of social media, when there is one.
Large-Scale Community Detection on YouTube for Topic Discovery and Exploration
Gargi, Ullas (Google, Inc.) | Lu, Wenjun (University of Maryland) | Mirrokni, Vahab (Google, Inc.) | Yoon, Sangho (Google, Inc.)
Detecting coherent, well-connected communities in large graphs provides insight into the graph structure and can serve as the basis for content discovery. Clustering is a popular technique for community detection but global algorithms that examine the entire graph do not scale. Local algorithms are highly parallelizable but perform sub-optimally, especially in applications where we need to optimize multiple metrics. We present a multi-stage algorithm based on local-clustering that is highly scalable, combining a pre-processing stage, a local clustering stage, and a post-processing stage. We apply it to the YouTube video graph to generate named clusters of videos with coherent content. We formalize coverage, coherence, and connectivity metrics and evaluate the quality of the algorithm for large YouTube graphs. Our use of local algorithms for global clustering, and its implementation and practical evaluation on such a large scale is a first of its kind.
Creating Conversations: An Automated Dialog System
Gandy, Lisa (Northwestern University) | Hammond, Kristian (Northwestern University)
Online news sites often include a comments section where readers are allowed to leave their thoughts. These comments often contain interesting and insightful conversations between readers about the news article. However, the richness of these conversations is often lost among other meaningless comments, and moreover all comments are found at the bottom of the web page. In this article, we discuss how our system inserts reader conversations into the news article to create a multimedia presentation called Shout Out. Shout Out features two virtual news anchors: one anchor reads the news and, when appropriate, pauses to have a conversation about the news with the other anchor. The current iteration of Shout Out combines natural language techniques and reader conversations to create an engaging system.
Automatic Group-Interactive Radio Using Social-Networks of Musicians
Fields, Ben (University of London) | Rhodes, Christophe (University of London) | d'Inverno, Mark (University of London)
Using request radio shows as a base interactive model, we present the Steerable Optimizing Self-Organized Radio (SoSoRadio) system as a prototypical music recommender system with robust automatic playlist generation. This work describes a web-based radio system that interacts with current listeners through the selection of periodic request songs from a pool of nominees.
Using the H-Index to Estimate Blog Authority
Devezas, José (Labs SAPO/UP) | Nunes, Sérgio (Instituto de Engenharia de Sistemas e Computadores do Porto, Universidade do Porto) | Ribeiro, Cristina (Instituto de Engenharia de Sistemas e Computadores do Porto)
Link analysis is a technique frequently used in the ranking of web sites. On the web, we often encounter content that is organized by entries, sorted from recent to old, and generally follows the structure of a blog. In this paper we explore and evaluate the usage of a bibliometrics measure, called h-index, for the task of blog ranking, in an information retrieval context. We base our experiments on the TREC Blogs08 collection, which comprises over 28 million posts. The results obtained indicate that the h-index is a robust metric that allows for an improved relevance discrimination between blogs, when compared to the in-degree. Additionally, tests performed using distinct versions of the post graph, indicate that this metric might tolerate a certain level of link clutter.
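As a rough illustration of the metric named in this abstract, the sketch below applies the standard bibliometric h-index to a blog, assuming each post plays the role of a paper and its number of in-links plays the role of citations. This mapping and the names `h_index` and `in_degree` are assumptions made for illustration; the abstract does not specify the authors' exact formulation.

```python
def h_index(inlink_counts):
    """Largest h such that at least h posts have >= h in-links each.

    `inlink_counts` holds one in-link count per post (assumed mapping).
    """
    ranked = sorted(inlink_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` posts all have at least `rank` in-links
        else:
            break
    return h


def in_degree(inlink_counts):
    """Total in-links received by the blog across all of its posts."""
    return sum(inlink_counts)


# Two hypothetical blogs with the same total in-degree.
steady_blog = [10, 8, 5, 4, 3]    # links spread over several posts
one_hit_blog = [30, 0, 0, 0, 0]   # all links point at a single post

print(in_degree(steady_blog), h_index(steady_blog))
print(in_degree(one_hit_blog), h_index(one_hit_blog))
```

Both hypothetical blogs have the same in-degree, yet the h-index separates the one with consistently linked posts from the one with a single heavily linked post. This is one plausible reading of why such a metric could discriminate between blogs differently than in-degree does.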
Analyzing Political Trends in the Blogosphere
Demartini, Gianluca (L3S Research Center) | Siersdorfer, Stefan (L3S Research Center) | Chelaru, Sergiu (L3S Research Center) | Nejdl, Wolfgang (L3S Research Center)
In recent years, the blogosphere has become a vital part of the web, covering a variety of different points of view and opinions on political and event-related topics such as immigration, election campaigns, or economic developments. Tracking public opinion is usually done by conducting surveys, resulting in significant costs both for interviewers and persons consulted. In this paper, we propose a method for extracting political trends in the blogosphere. To this end, we apply sentiment and time series analysis techniques in combination with aggregation methods on blog data to estimate the temporal development of opinions on politicians.
A Bootstrapping Approach to Identifying Relevant Tweets for Social TV
Dan, Ovidiu (Lehigh University) | Feng, Junlan (AT&T Labs Research) | Davison, Brian D. (Lehigh University)
Manufacturers of TV sets have recently started adding social media features to their products. Some of these products display microblogging messages relevant to the TV show which the user is currently watching. However, such systems suffer from low precision and recall when they use the title of the show to search for relevant messages. Titles of some popular shows such as Lost or Survivor are highly ambiguous, resulting in messages unrelated to the show. Thus, there is a need to develop filtering algorithms that can achieve both high precision and recall. Filtering microblogging messages for Social TV poses several challenges, including lack of training data, lack of proper grammar and capitalization, lack of context due to text sparsity, etc. We describe a bootstrapping algorithm which uses a small manually labeled dataset, a large dataset of unlabeled messages, and some domain knowledge to derive a high precision classifier that can successfully filter microblogging messages which discuss television shows. The classifier is designed to generalize to TV shows which were not part of the training set. The algorithm achieves high precision on our two test datasets and successfully generalizes to unseen television shows. Furthermore, it compares favorably to a text classifier specifically trained on the television shows used for testing.