Information Technology
The Length of Bridge Ties: Structural and Geographic Properties of Online Social Interactions
Volkovich, Yana (Barcelona Media Foundation) | Scellato, Salvatore (University of Cambridge) | Laniado, David (Barcelona Media Foundation) | Mascolo, Cecilia (University of Cambridge) | Kaltenbrunner, Andreas (Barcelona Media Foundation)
The popularity of the Web has allowed individuals to communicate and interact with each other on a global scale: people connect both to close friends and acquaintances, creating ties that can bridge otherwise separated groups of people. Recent evidence suggests that spatial distance is still affecting social links established on online platforms, with online ties preferentially connecting closer people. In this work we study the relationships between interaction strength, spatial distance and structural position of ties between members of a large-scale online social networking platform, Tuenti. We discover that ties in highly connected social groups tend to span shorter distances than connections bridging together otherwise separated portions of the network. We also find that such bridging connections have lower social interaction levels than ties within the inner core of the network and ties connecting to its periphery. Our results suggest that spatial constraints on online social networks are intimately connected to structural network properties, with important consequences for information diffusion.
Who Does What on the Web: A Large-Scale Study of Browsing Behavior
Goel, Sharad (Yahoo! Research) | Hofman, Jake M. (Yahoo! Research) | Sirer, M. Irmak (Northwestern University)
As the Web has become integrated into daily life, understanding how individuals spend their time online impacts domains ranging from public policy to marketing. It is difficult, however, to measure even simple aspects of browsing behavior via conventional methods---including surveys and site-level analytics---due to limitations of scale and scope. In part addressing these limitations, large-scale Web panel data are a relatively novel means for investigating patterns of Internet usage. In one of the largest studies of browsing behavior to date, we pair Web histories for 250,000 anonymized individuals with user-level demographics---including age, sex, race, education, and income---to investigate three topics. First, we examine how behavior changes as individuals spend more time online, showing that the heaviest users devote nearly twice as much of their time to social media relative to typical individuals. Second, we revisit the digital divide, finding that the frequency with which individuals turn to the Web for research, news, and healthcare is strongly related to educational background, but not as closely tied to gender and ethnicity. Finally, we demonstrate that browsing histories are a strong signal for inferring user attributes, including ethnicity and household income, a result that may be leveraged to improve ad targeting.
Cultural Analytics of Large Datasets from Flickr
Ushizima, Daniela (Lawrence Berkeley National Laboratory) | Manovich, Lev (University of California, San Diego) | Margolis, Todd (University of California, San Diego) | Douglas, Jeremy (Ashford University)
Deluge has become a metaphor to describe the amount of information to which we are subjected, and very often we feel we are drowning while our access to information is rising. Devising mechanisms for exploring massive image sets according to perceptual attributes is still a challenge, even more so when dealing with user-generated social media content. Such images tend to be heterogeneous, and relying on metadata alone can be misleading. This paper describes a set of tools designed to analyze large sets of user-created art-related images using image features describing color, texture, composition and orientation. The proposed pipeline makes it possible to discriminate Flickr groups in terms of feature vectors and clustering parameters. The algorithms are general enough to be applied to other domains in which the main question is about the variability of the images.
Happy, Nervous or Surprised? Classification of Human Affective States in Social Media
Choudhury, Munmun De (Microsoft Research, Redmond) | Gamon, Michael (Microsoft Research, Redmond) | Counts, Scott (Microsoft Research, Redmond)
Sentiment classification has been a well-investigated research area in the computational linguistics community. However, most of the research is focused on detecting only the polarity of text, often needing extensive manual labeling of ground truth. Additionally, little attention has been directed towards a finer analysis of human moods and affective states. Motivated by research in psychology, we propose and develop a classifier of several human affective states in social media. Starting with about 200 moods, we utilize Mechanical Turk studies to derive naturalistic signals from posts shared on Twitter about a variety of affects of individuals. This dataset is then deployed in an affect classification task with promising results. Our findings indicate that different types of affect involve different emotional content and usage styles; hence the performance of the classifier on various affects can differ considerably.
Enhancing Event Descriptions through Twitter Mining
Tanev, Hristo (Joint Research Centre, European Commission) | Ehrmann, Maud (Joint Research Centre, European Commission) | Piskorski, Jakub (Frontex) | Zavarella, Vanni (Joint Research Centre, European Commission)
We describe a simple IR approach for linking news about events, detected by an event extraction system, to messages from Twitter (tweets). In particular, we explore several methods for creating event-specific queries for Twitter and provide a quantitative and qualitative evaluation of the relevance and usefulness of the information obtained from the tweets. We show that methods based on word co-occurrence clustering, domain-specific keywords and named entity recognition improve performance with respect to a basic approach.
Modeling Diffusion in Social Networks Using Network Properties
Luu, Duc Minh (Singapore Management University) | Lim, Ee-Peng (Singapore Management University) | Hoang, Tuan-Anh (Singapore Management University) | Chua, Freddy Chong Tat (Singapore Management University)
Diffusion of items occurs in social networks due to spreading of items through word of mouth and exogenous factors. These items may be news, products, videos, advertisements or contagious viruses. Previous research has studied the diffusion process at both the macro and micro levels. The former models the number of item adopters in the diffusion process while the latter determines which individuals adopt the item. In this paper, we establish a general probabilistic framework, which can be used to derive macro-level diffusion models, including the well-known Bass Model (BM). Using this framework, we develop several other models considering the social network's degree distribution coupled with the assumption of linear influence by neighboring adopters in the diffusion process. Through evaluation on synthetic data, this paper shows that the degree distribution actually changes during the diffusion process. We therefore introduce a multi-stage diffusion model to cope with variable degree distribution. By conducting experiments on both synthetic and real datasets, we show that our proposed diffusion models can recover the diffusion parameters from the observed diffusion data, which allows us to model diffusion with high accuracy.
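For readers unfamiliar with the Bass Model mentioned in this abstract, the following is a minimal sketch of its standard discrete-time form, not the authors' framework. It assumes the textbook formulation in which new adopters per step equal (p + q * N / m) * (m - N), where p is the coefficient of external influence, q the coefficient of word-of-mouth influence, m the market size and N the cumulative adopters so far. The function name and parameter values are illustrative only.

```python
def bass_new_adopters(p, q, m, steps):
    """Simulate the discrete-time Bass Model.

    p: coefficient of external (exogenous) influence
    q: coefficient of internal (word-of-mouth) influence
    m: total number of potential adopters
    steps: number of time steps to simulate
    Returns the list of new adopters at each step.
    """
    cumulative = 0.0
    new_per_step = []
    for _ in range(steps):
        # Adoption rate grows with the fraction of people who already adopted.
        rate = p + q * cumulative / m
        # Only those who have not adopted yet can adopt now.
        new = rate * (m - cumulative)
        cumulative += new
        new_per_step.append(new)
    return new_per_step


# Illustrative parameter values (assumptions, not taken from the paper).
curve = bass_new_adopters(p=0.03, q=0.38, m=1000.0, steps=30)
```

With these values the number of new adopters rises to a peak and then declines as the pool of remaining non-adopters shrinks, which is the characteristic macro-level adoption curve.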
Trendminer: An Architecture for Real Time Analysis of Social Media Text
Preotiuc-Pietro, Daniel (University of Sheffield) | Samangooei, Sina (University of Southampton) | Cohn, Trevor (University of Southampton) | Gibbins, Nicholas (University of Sheffield) | Niranjan, Mahesan (University of Southampton)
The emergence of online social networks (OSNs) and the accompanying availability of large amounts of data pose a number of new natural language processing (NLP) and computational challenges. Data from OSNs is different from data from traditional sources (e.g. newswire). The texts are short, noisy and conversational. Another important issue is that data occurs in real-time streams, needing immediate analysis that is grounded in time and context. In this paper we describe a new open-source framework for efficient text processing of streaming OSN data (available at www.trendminer-project.eu). Whilst researchers have made progress in adapting or creating text analysis tools for OSN data, a system to unify these tasks has yet to be built. Our system is focused on a real-world scenario where fast processing and accuracy are paramount. We use the MapReduce framework for distributed computing and present running times for our system in order to show that scaling to online scenarios is feasible. We describe the components of the system and evaluate their accuracy. Our system supports easy integration of future modules in order to extend its functionality.
Feasibility Study on Detection of Transportation Information Exploiting Twitter as a Sensor
Sasaki, Kenta (Toshiba Corporation) | Nagano, Shinichi (Toshiba Corporation) | Ueno, Koji (Toshiba Corporation) | Cho, Kenta (Toshiba Corporation)
The concept of a smart community has recently been attracting great attention as a means of utilizing energy effectively. One of the modules constituting the smart community is an intelligent transportation system, in which various sensors track movements of people and vehicles in real time to optimize migration pathways or means. Social media have the potential to serve as sensors, since people often post transportation information on such media. This paper presents a feasibility study on detecting information, focusing on train status information, by exploiting Twitter as a sensor. We dealt with two issues: (1) for the ambiguity of textual information expressed in tweets, we utilized heuristic rules in text manipulation, and (2) for the differences in the numbers of tweets among train lines, we optimized parameter values in statistical analysis for each train line. The experimental results show that the F-measure of detecting the information was more than 0.85 and the time taken to detect the information was less than 4 minutes. As a result we confirmed the high potential of detecting transportation information through Twitter.
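The F-measure reported in this abstract combines precision and recall into a single score. The sketch below assumes the balanced variant (F1, the harmonic mean of precision and recall); the abstract does not say which weighting was used, and the counts in the example are made up for illustration.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1) from raw detection counts.

    true_pos: events correctly detected
    false_pos: detections that were not real events
    false_neg: real events that were missed
    """
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 correct detections, 10 false alarms, 20 misses.
score = f_measure(true_pos=90, false_pos=10, false_neg=20)
```

Because the harmonic mean is pulled toward the smaller of the two values, a high F-measure requires both few false alarms and few missed events.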
A Supervised Approach to Predict Company Acquisition with Factual and Topic Features Using Profiles and News Articles on TechCrunch
Xiang, Guang (Carnegie Mellon University) | Zheng, Zeyu (Carnegie Mellon University) | Wen, Miaomiao (Carnegie Mellon University) | Hong, Jason (Carnegie Mellon University) | Rose, Carolyn (Carnegie Mellon University) | Liu, Chao (Microsoft Research)
Merger and Acquisition (M&A) prediction has been an interesting and challenging research topic in the past few decades. However, past work has only adopted numerical features in building models, and the valuable textual information from the great variety of social media sites has not been touched at all. To fully explore this information, we used the profiles and news articles for companies and people on TechCrunch, the leading and largest public database for the tech world, which anybody can edit. Specifically, we explored topic features via topic modeling techniques, as well as a set of other novel features of our design within a machine learning framework. We conducted experiments of the largest scale in the literature, and achieved a high true positive rate (TP) between 60% and 79.8% with a false positive rate (FP) mostly between 0% and 8.3% over company categories with a small number of missing attributes in the CrunchBase profiles.
The YouTube Social Network
Wattenhofer, Mirjam (Google Zurich) | Wattenhofer, Roger (ETH Zurich) | Zhu, Zack (ETH Zurich)
Today, YouTube is the largest user-driven video content provider in the world; it has become a major platform for disseminating multimedia information. A major contribution to its success comes from the user-to-user social experience that differentiates it from traditional content broadcasters. This work examines the social network aspect of YouTube by measuring the full-scale YouTube subscription graph, comment graph, and video content corpus. We find YouTube to deviate significantly from network characteristics that mark traditional online social networks, such as homophily, reciprocative linking, and assortativity. However, compared to reported characteristics of another content-driven online social network, Twitter, YouTube is remarkably similar. Examining the social and content facets of user popularity, we find a stronger correlation between a user's social popularity and his/her most popular content as opposed to typical content popularity. Finally, we demonstrate an application of our measurements for classifying YouTube Partners, who are selected users that share YouTube's advertisement revenue. Results are encouraging despite the highly imbalanced nature of the classification problem.