Tang, Shiliang (University of California, Santa Barbara) | Liu, Qingyun (University of California, Santa Barbara) | McQueen, Megan (Yale University) | Counts, Scott (Microsoft Research) | Jain, Apurv (Microsoft Research) | Zheng, Haitao (University of California, Santa Barbara) | Zhao, Ben Y. (University of California, Santa Barbara)
We examine the quality of information and communication in online investment discussion boards. We show that positivity bias and skewed risk/reward assessments, exacerbated by the insular nature of the community and its social structure, contribute to underperforming investment advice and unnecessary trading. Discussion post sentiment has negligible correlation with future stock market returns, but does have a positive correlation with trading volumes and volatility. Our trading simulations show that across different timeframes, this misinformation leads 50-70% of users to underperform the market average. We then examine social structure in communities, and show that the majority of market sentiment is produced by a small number of community leaders, and that many members actively resist negative sentiment, thus minimizing viewpoint diversity. To improve generated information content in online investment communities, we suggest designing to increase diversity of opinion, minimize friction around incorporating new information, and provide performance feedback for self-correction.
We analyze data about the micro-blogging site Twitter using sentiment extraction techniques. From an information perspective, Twitter users are involved mostly in two processes: information creation and subsequent distribution (tweeting), and pure information distribution (retweeting), with a pronounced preference for the first. However, a rather substantial fraction of tweets are retweeted. Here, we address the role of the sentiment expressed in tweets for their potential aftermath. We find that although the overall sentiment (polarity) does not influence the probability that a tweet is retweeted, a new measure called "emotional divergence" does have an impact. In general, tweets with high emotional divergence have a better chance of being retweeted, hence influencing the distribution of information.
Debate is open as to whether social media communities resemble real-life communities, and to what extent. We contribute to this discussion by testing whether established sociological theories of real-life networks hold in Twitter. In particular, for 228,359 Twitter profiles, we compute network metrics (e.g., reciprocity, structural holes, Simmelian ties) that the sociological literature has found to be related to parts of one's social world (i.e., to topics, geography and emotions), and test whether these real-life associations still hold in Twitter. We find that, much like individuals in real-life communities, social brokers (those who span structural holes) are opinion leaders who tweet about diverse topics, have geographically wide networks, and express not only positive but also negative emotions. Furthermore, Twitter users who express positive (negative) emotions cluster together, to the extent of having a correlation coefficient between one's emotions and those of friends as high as 0.45. Understanding Twitter's social dynamics has not only theoretical implications for studies of social networks but also practical implications, including the design of self-reflecting user interfaces that make people aware of their emotions, spam detection tools, and effective marketing campaigns.
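The emotion-clustering result above can be illustrated with a minimal sketch. It assumes a Pearson correlation between each user's sentiment score and the mean score of that user's friends; the abstract does not say which coefficient or aggregation was used, and the names `sentiment`, `friends`, and `pearson` as well as the toy values are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

# Hypothetical per-user sentiment scores in [-1, 1] (illustrative, not real data).
sentiment = {"a": 0.8, "b": 0.6, "c": -0.4, "d": -0.7}
# Hypothetical friendship lists.
friends = {"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["c"]}

users = sorted(sentiment)
own = [sentiment[u] for u in users]
# Assumption: a user's "friend emotion" is the mean sentiment of their friends.
friend_mean = [mean(sentiment[f] for f in friends[u]) for u in users]

r = pearson(own, friend_mean)
print(f"correlation between own and friends' sentiment: {r:.2f}")
```

A positive `r` on real data would indicate that users with similar expressed emotions tend to be connected, which is the pattern the abstract reports.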
Kim, Jihie (USC Information Sciences Institute) | Yoo, Jaebong (USC Information Sciences Institute) | Lim, Ho (USC Information Sciences Institute) | Qiu, Huida (USC Information Sciences Institute) | Kozareva, Zornitsa (USC Information Sciences Institute) | Galstyan, Aram (USC Information Sciences Institute)
Learning sentiment models from short texts such as tweets is a notoriously challenging problem due to very strong noise and data sparsity. This paper presents a novel, collaborative filtering-based approach for sentiment prediction in Twitter conversation threads. Given a set of sentiment holders and sentiment targets, we assume we know the true sentiments for a small fraction of holder-target pairs. This information is then used to predict the sentiment of a previously unknown user towards another user or an entity using collaborative filtering algorithms. We validate our model on two Twitter datasets using different collaborative filtering techniques. Our preliminary results demonstrate that the proposed approach can be effectively used in Twitter sentiment prediction, thus mitigating the data sparsity problem.
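As a rough illustration of the holder-target setup described above, the sketch below uses plain user-based collaborative filtering with cosine similarity. This is one common technique chosen for illustration, not necessarily one of the techniques the authors evaluated; the names `known`, `cosine`, and `predict` and the toy sentiment values are hypothetical.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity computed over the targets both holders have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[t] * v[t] for t in shared)
    norm_u = sqrt(sum(u[t] ** 2 for t in shared))
    norm_v = sqrt(sum(v[t] ** 2 for t in shared))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def predict(known, holder, target):
    """Similarity-weighted average of other holders' sentiment toward target."""
    num = den = 0.0
    for other, ratings in known.items():
        if other == holder or target not in ratings:
            continue
        w = cosine(known.get(holder, {}), ratings)
        num += w * ratings[target]
        den += abs(w)
    # Fall back to neutral (0.0) when no similar holder has rated the target.
    return num / den if den else 0.0

# Hypothetical known sentiments: holder -> {target: score in [-1, 1]}.
known = {
    "alice": {"brand_x": 1.0, "brand_y": -1.0},
    "bob": {"brand_x": 1.0, "brand_y": -1.0, "brand_z": 1.0},
    "carol": {"brand_x": -1.0, "brand_y": 1.0, "brand_z": -1.0},
}

# Predict alice's unobserved sentiment toward brand_z.
score = predict(known, "alice", "brand_z")
print(score)
```

Here `alice` agrees with `bob` and disagrees with `carol` on the shared targets, so the prediction for `brand_z` leans toward `bob`'s positive sentiment. Because similarity is computed only over shared targets, the sketch still produces a prediction when most holder-target pairs are unobserved.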
Lerman, Kristina (University of Southern California) | Arora, Megha (Indraprastha Institute of Information Technology) | Gallegos, Luciano (University of Southern California) | Kumaraguru, Ponnurangam (Indraprastha Institute of Information Technology) | Garcia, David (Eidgenössische Technische Hochschule Zürich (ETH-Zurich))
The social connections people form online affect the quality of information they receive and their online experience. Although a host of socioeconomic and cognitive factors have been implicated in the formation of offline social ties, few of them have been empirically validated, particularly in an online setting. In this study, we analyze a large corpus of geo-referenced messages, or tweets, posted by social media users from a major US metropolitan area. We linked these tweets to US Census data through their locations. This allowed us to measure emotions expressed in the tweets posted from an area and the structure of social connections, and to use that area's socioeconomic characteristics in the analysis. We extracted the structure of online social interactions from the people mentioned in tweets from that area. We find that at an aggregate level, places where social media users engage more deeply with less diverse social contacts are those where they express more negative emotions, like sadness and anger. Demographics also have an impact: these places have residents with lower household income and education levels. Conversely, places where people engage less frequently but with diverse contacts have happier, more positive messages posted from them and also have better educated, younger, more affluent residents. Results suggest that cognitive factors and offline characteristics affect the quality of online interactions. Our work highlights the value of linking social media data to traditional data sources, such as the US Census, to drive novel analysis of online behavior.