A Sentiment-Aware Approach to Community Formation in Social Media
Nguyen, Thin (Deakin University) | Phung, Dinh (Deakin University) | Adams, Brett (Curtin University) | Venkatesh, Svetha (Deakin University)
Participating in a community exemplifies the aspect of sharing, networking and interacting in a social media system. There has been extensive work on characterising on-line communities by their contents and tags using topic modelling tools. However, the role of sentiment and mood has not been studied. Arguably, mood is an integral feature of a text, and becomes more significant in the context of social media: two communities might discuss precisely the same topics, yet within an entirely different atmosphere. Such sentiment-related distinctions are important for many kinds of analysis and applications, such as community recommendation. We present a novel approach to identification of latent hyper-groups in social communities based on users’ sentiment. The results show that a sentiment-based approach can yield useful insights into community formation and meta-communities, having potential applications in, for example, mental health—by targeting support or surveillance to communities with negative mood—or in marketing—by targeting customer communities having the same sentiment on similar topics.
Do You Feel What I Feel? Social Aspects of Emotions in Twitter Conversations
Kim, Suin (KAIST) | Bak, JinYeong (KAIST) | Oh, Alice Haeyun (KAIST)
We present a computational framework for understanding the social aspects of emotions in Twitter conversations. Using unannotated data and semisupervised machine learning, we look at emotional transitions, emotional influences among the conversation partners, and patterns in the overall emotional exchanges. We find that conversational partners usually express the same emotion, which we name Emotion accommodation, but when they do not, one of the conversational partners tends to respond with a positive emotion. We also show that tweets containing sympathy, apology, and complaint are significant emotion influencers. We verify the emotion classification part of our framework by a human-annotated corpus.
Tracking Sentiment and Topic Dynamics from Social Media
He, Yulan (The Open University) | Lin, Chenghua (The Open University) | Gao, Wei (Qatar Foundation) | Wong, Kam-Fai (The Chinese University of Hong Kong)
We propose a dynamic joint sentiment-topic model (dJST) which allows the detection and tracking of views of current and recurrent interests and shifts in topic and sentiment. Both topic and sentiment dynamics are captured by assuming that the current sentiment-topic specific word distributions are generated according to the word distributions at previous epochs. We derive efficient online inference procedures to sequentially update the model with newly arrived data and show the effectiveness of our proposed model on the Mozilla add-on reviews crawled between 2007 and 2011.
Using Group Membership Markers for Group Identification
Gawron, Jean Mark (San Diego State University) | Gupta, Dipak (San Diego State University) | Stephens, Kellen (San Diego State University) | Tsou, Ming-Hsiang (San Diego State University) | Spitzberg, Brian (San Diego State University) | An, Li (San Diego State University)
We describe a system for automatically ranking documents by degree of militancy, designed as a tool both for finding militant websites and prioritizing the data found. We compare three ranking systems: one employing a small hand-selected vocabulary based on group membership markers used by insiders to identify members and member properties (us) and outsiders and threats (them), one with a much larger vocabulary, and one with a small vocabulary chosen by Mutual Information. We use the same vocabularies to build classifiers. The ranker that achieves the best correlations with human judgments uses the small us-them vocabulary. We confirm and extend recent results in sentiment analysis (Paltoglou 2010), showing that a feature-weighting scheme taken from classical IR (TFIDF) produces the best ranking system; we also find, surprisingly, that adjusting these weights with SVM training, while producing a better classifier, produces a worse ranker. Increasing vocabulary size similarly improves classification (while worsening ranking).
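As an illustration of the TFIDF-weighted ranking idea mentioned in this abstract, the following is a minimal sketch and not the authors' system. The function name `tfidf_rank`, the toy documents, the marker vocabulary, and the smoothed IDF formula are all assumptions chosen for the example.

```python
import math
from collections import Counter


def tfidf_rank(docs, vocabulary):
    """Rank documents by their summed TF-IDF weight over a fixed vocabulary.

    Hypothetical helper: whitespace tokenization and a smoothed IDF are
    assumptions of this sketch, not details taken from the abstract.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each vocabulary term.
    df = {t: sum(1 for toks in tokenized if t in toks) for t in vocabulary}
    scores = []
    for toks in tokenized:
        counts = Counter(toks)
        score = 0.0
        for term in vocabulary:
            if counts[term]:
                tf = counts[term] / len(toks)  # length-normalized term frequency
                idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0  # smoothed IDF
                score += tf * idf
        scores.append(score)
    # Indices of documents, highest score first.
    order = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy data (assumed): a tiny "us-them" style marker vocabulary.
docs = [
    "we must defend our brothers against them",
    "the weather is nice today",
    "they threaten us and our people",
]
vocabulary = ["we", "us", "our", "them", "they"]
order, scores = tfidf_rank(docs, vocabulary)
print(order)
```

The documents that use the marker terms most densely rank first; a document with no marker terms scores zero and ranks last.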
Epidemic Intelligence for the Crowd, by the Crowd
Diaz-Aviles, Ernesto (University of Hannover) | Stewart, Avaré (University of Hannover) | Velasco, Edward (Robert Koch Institute) | Denecke, Kerstin (University of Hannover) | Nejdl, Wolfgang (University of Hannover)
Tracking Twitter for public health has shown great potential. However, most recent work has been focused on correlating Twitter messages to influenza rates, a disease that exhibits a marked seasonal pattern. In the presence of sudden outbreaks, how can social media streams be used to strengthen surveillance capacity? In May 2011, Germany reported an outbreak of Enterohemorrhagic Escherichia coli (EHEC). It was one of the largest described outbreaks of EHEC worldwide and the largest in Germany. In this work, we study the crowd's behavior in Twitter during the outbreak. In particular, we report how tracking Twitter helped to detect key user messages that triggered signal detection alarms before MedISys and other well established early warning systems. We also introduce a personalized learning to rank approach that exploits the relationships discovered by: (i) latent semantic topics computed using Latent Dirichlet Allocation (LDA), and (ii) observing the social tagging behavior in Twitter, to rank tweets for epidemic intelligence. Our results provide the grounds for new public health research based on social media.
Identifying Microblogs for Targeted Contextual Advertising
Dave, Kushal Shailesh (International Institute of Information Technology, Hyderabad) | Varma, Vasudeva (International Institute of Information Technology, Hyderabad)
Microblogging sites such as Facebook, Twitter, and Google+ present a good opportunity for targeting advertisements that are contextually related to the microblog content. However, the sparse and noisy text makes identifying the microblogs suitable for advertising a very hard problem. In this work, we treat the problem of identifying the microblogs that could be targeted for advertisements as a two-step classification task. In the first pass, microblogs suitable for advertising are identified. In the second pass, we build a model to find the sentiment of the advertisable microblog. The systems use features derived from part-of-speech tags and the tweet content, along with external resources such as query logs and n-gram dictionaries from previously labeled data. This work aims at providing a thorough insight into the problem and analyzing various features to assess which contribute the most towards identifying the tweets that can be targeted for advertisements.
Visualizing Topic Models
Chaney, Allison June-Barlow (Princeton University) | Blei, David M. (Princeton University)
Managing large collections of documents is an important problem for many areas of science, industry, and culture. Probabilistic topic modeling offers a promising solution. Topic modeling is an unsupervised machine learning method that learns the underlying themes in a large collection of otherwise unorganized documents. This discovered structure summarizes and organizes the documents. However, topic models are high-level statistical tools—a user must scrutinize numerical distributions to understand and explore their results. In this paper, we present a method for visualizing topic models. Our method creates a navigator of the documents, allowing users to explore the hidden structure that a topic model discovers. These browsing interfaces reveal meaningful patterns in a collection, helping end-users explore and understand its contents in new ways. We provide open source software of our method.
Global Dynamics of Online Group Conversations
Bhatt, Rushi (Yahoo! Labs) | Barman, Kishor (Tata Institute of Fundamental Research)
Public online groups allow individuals to carry out conversations of common interests. Study of such group conversations provides a unique opportunity to study patterns of human conversations without violating individual privacy. The observational studies conducted in this paper are an attempt to identify the main correlates of continued growth of conversations, thereby clearing the path to developing predictive models of user participation. We study temporal evolution of online group discussions. Surprisingly, we find that individual discussion groups display distinctively q-exponential shaped inter-message time-to-reply distributions, unlike the power law distributions seen in email conversations. We show, using simulations, that the heavy-tailed distribution of time to reply, which we also observe when all data is combined, originates from mixtures of q-exponentials. We also find that popular threads come to be so from the very beginning as opposed to evolving to be more popular as they grow. This raises new possibilities for developing generative models of thread growth.
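To make the idea of a mixture of q-exponentials concrete, here is a minimal sketch of drawing time-to-reply samples from such a mixture by inverse-transform sampling. It is not the authors' simulation: the parameterization (1 < q < 2, rate `lam`), the function names, and the component values are assumptions for illustration only.

```python
import random


def q_exp_survival(x, q, lam):
    """P(X > x) for a q-exponential with 1 < q < 2 and rate lam > 0.

    Corresponds to the density (2 - q) * lam * (1 + (q - 1) * lam * x) ** (-1 / (q - 1)).
    """
    return (1.0 + (q - 1.0) * lam * x) ** (-(2.0 - q) / (q - 1.0))


def q_exp_inverse_survival(u, q, lam):
    """Return x >= 0 such that q_exp_survival(x, q, lam) == u, for 0 < u <= 1."""
    return (u ** (-(q - 1.0) / (2.0 - q)) - 1.0) / ((q - 1.0) * lam)


def sample_mixture(components, n, seed=0):
    """Draw n samples from an equal-weight mixture of q-exponentials.

    components: list of (q, lam) pairs, one per hypothetical discussion group.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        q, lam = rng.choice(components)  # pick a group uniformly at random
        u = 1.0 - rng.random()           # uniform on (0, 1], avoids u == 0
        samples.append(q_exp_inverse_survival(u, q, lam))
    return samples


# Assumed component parameters, one (q, lam) pair per simulated group.
components = [(1.2, 1.0), (1.5, 0.5), (1.8, 0.1)]
samples = sample_mixture(components, 1000)
print(len(samples), min(samples) >= 0.0)
```

Pooling the samples across groups gives an empirical distribution that can be compared, for example on a log-log plot, with the per-group distributions.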
An Evaluation of the Role of Sentiment in Second Screen Microblog Search Tasks
Bermingham, Adam (Dublin City University) | Smeaton, Alan F (Dublin City University)
The recent prominence of the real-time web is proving both challenging and disruptive for information retrieval and web data mining research. User-generated content on the real-time web is perhaps best epitomised by content on microblogging platforms, such as Twitter. Given the substantial quantity of microblog posts that may be relevant to a user's query at a point in time, automated methods are required to sift through this information. Sentiment analysis offers a promising direction for modelling microblog content. We build and evaluate a sentiment-based filtering system using real-time user studies. We find a significant role played by sentiment in the search scenarios, observing detrimental effects in filtering out certain sentiment types. We make a series of observations regarding associations between document-level sentiment and user feedback, including associations with user profile attributes, and users' prior topic sentiment.
More or Less: Amount of Personal Information Displayed in Social Network Site Profiles and Its Impact on Viewers’ Intentions to Socialize with the Profile Owner
Baruh, Lemi (Koc University) | Chisik, Yoram (University of Madeira) | Bisson, Christophe (Kadir Has University) | Senova, Basak (NOMAD)
This paper presents the results of an experiment that employed a 2 (low vs. high information) by 2 (male vs. female profile) design to investigate the relationship between amount of information displayed in a Social Network Site (SNS) profile and profile viewers’ intentions to engage in further social interactions (communicate online, add to SNS profile, and meet face-to-face) with the profile owner. The results indicate that more information increases the likelihood of relationship initiation for male profiles but decreases it for female profiles. Also, viewers are inclined to initiate an interaction when less information is presented in the SNS profile of a person of the opposite sex, but require more information from profiles of their own sex.