Dinakar, Karthik (Massachusetts Institute of Technology) | Jones, Birago (Massachusetts Institute of Technology) | Lieberman, Henry (Massachusetts Institute of Technology) | Picard, Rosalind (Massachusetts Institute of Technology) | Rose, Carolyn (Carnegie Mellon University) | Thoman, Matthew (Northeastern University) | Reichart, Roi (Massachusetts Institute of Technology)
Adolescent cyber-bullying on social networks is a phenomenon that has received widespread attention. Recent work by sociologists has examined this phenomenon in the larger context of teenage drama and its manifestations on social networks. Tackling cyber-bullying involves two key components: automatic detection of possible cases, and interaction strategies that encourage reflection and emotional support. A key step is showing distressed teenagers that they are not alone in their plight. Conventional topic spotting and document classification into labels like "dating" or "sports" are not enough to effectively match stories for this task. In this work, we examine a corpus of 5500 stories from distressed teenagers on a major youth social network. We combine Latent Dirichlet Allocation and human interpretation of its output, using principles from sociolinguistics, to extract high-level themes in the stories and use them to match new stories to similar ones. A user evaluation of the story matching shows that theme-based retrieval does a better job of finding relevant and effective stories for this application than conventional approaches.
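As a rough illustration of the theme-based matching idea, the sketch below ranks stored stories by how closely their theme mixture resembles that of a new story. It assumes each story has already been reduced to a vector of theme proportions (for example, by a topic model such as Latent Dirichlet Allocation). The theme names, the story ids, the numbers, the use of cosine similarity, and the helper names `cosine` and `match_stories` are all assumptions made for illustration; the abstract does not specify the similarity measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_stories(query_themes, corpus_themes, k=2):
    """Return the ids of the k stories whose theme mixture is closest to the query."""
    ranked = sorted(
        corpus_themes,
        key=lambda story_id: cosine(query_themes, corpus_themes[story_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical theme proportions, invented for illustration.
# Vector order: [breakup, bullying at school, family conflict]
corpus = {
    "story_a": [0.80, 0.10, 0.10],
    "story_b": [0.05, 0.90, 0.05],
    "story_c": [0.10, 0.70, 0.20],
}
new_story = [0.10, 0.80, 0.10]

top = match_stories(new_story, corpus, k=2)
print(top)
```

With these invented numbers, the two stories dominated by the same theme as the new story rank ahead of the unrelated one, which is the behavior theme-based retrieval aims for.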
The internet has become a resource for adolescents who are distressed by social and emotional problems. Social network analysis can provide new opportunities for helping people seeking support online, but only if we understand the salient issues that are highly relevant to participants' personal circumstances. In this paper, we present a stacked generalization modeling approach to analyze an online community supporting adolescents under duress. While traditional supervised predictive methods rely on robust hand-crafted feature space engineering, mixed-initiative semi-supervised topic models are often better at extracting high-level themes that go beyond such feature spaces. We present a strategy that combines the strengths of both types of models, inspired by Prevention Science, which deals with the identification and amelioration of risk factors that predict psychological, psychosocial, and psychiatric disorders within and across populations (in our case, teenagers) rather than treating them post facto. In this study, prevention scientists used a social science thematic analytic approach to code stories according to a fine-grained analysis of salient social, developmental, or psychological themes they deemed relevant, and these are then analyzed by a society of models. We show that a stacked generalization of such an ensemble fares better than individual binary predictive models.
We present an approach for cyberbullying detection based on state-of-the-art text classification and a common sense knowledge base, which permits recognition over a broad spectrum of topics in everyday life. We analyze a narrower range of subject matter associated with bullying and construct BullySpace, a common sense knowledge base that encodes particular knowledge about bullying situations. We then perform joint reasoning with common sense knowledge about a wide range of everyday life topics. We analyze messages using our novel AnalogySpace common sense reasoning technique. We also take into account social network analysis and other factors. We evaluate the model on real-world instances that have been reported by users on Formspring, a social networking website that is popular with teenagers. On the intervention side, we explore a set of reflective user interaction paradigms with the goal of promoting empathy among social network participants. We propose an air traffic control-like dashboard, which alerts moderators to large-scale outbreaks that appear to be escalating or spreading and helps them prioritize the current deluge of user complaints. For potential victims, we provide educational material that informs them about how to cope with the situation and connects them with emotional support from others. A user evaluation shows that in-context, targeted, and dynamic help during cyberbullying situations fosters end-user reflection that promotes better coping strategies.
The scourge of cyberbullying has assumed alarming proportions, with an ever-increasing number of adolescents admitting to having dealt with it either as a victim or as a bystander. Anonymity and the lack of meaningful supervision in the electronic medium are two factors that have exacerbated this social menace. Comments or posts involving sensitive topics that are personal to an individual are more likely to be internalized by a victim, often resulting in tragic outcomes. We decompose the overall detection problem into the detection of sensitive topics, which lends itself to text classification sub-problems. We experiment with a corpus of 4500 YouTube comments, applying a range of binary and multiclass classifiers. We find that binary classifiers for individual labels outperform multiclass classifiers. Our findings show that the detection of textual cyberbullying can be tackled by building individual topic-sensitive classifiers.
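The decomposition into one binary classifier per sensitive topic can be sketched as follows. This is a minimal illustration and not the classifiers or features used in the study: it trains a tiny multinomial naive Bayes model (with add-one smoothing) per topic and reports every topic whose classifier fires. The topic names, the six toy comments, and the names `BinaryNB` and `sensitive_topics` are all hypothetical.

```python
import math
from collections import Counter


class BinaryNB:
    """Tiny multinomial naive Bayes for a single yes/no label, with add-one smoothing."""

    def fit(self, docs, labels):
        self.counts = {True: Counter(), False: Counter()}
        self.n_docs = {True: 0, False: 0}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc.lower().split())
            self.n_docs[y] += 1
        self.vocab = set(self.counts[True]) | set(self.counts[False])
        return self

    def _log_score(self, tokens, y):
        # Smoothed class prior plus smoothed per-token likelihoods, in log space.
        total = sum(self.counts[y].values()) + len(self.vocab)
        score = math.log((self.n_docs[y] + 1) / (sum(self.n_docs.values()) + 2))
        for token in tokens:
            if token in self.vocab:  # tokens never seen in training are ignored
                score += math.log((self.counts[y][token] + 1) / total)
        return score

    def predict(self, doc):
        tokens = doc.lower().split()
        return self._log_score(tokens, True) > self._log_score(tokens, False)


# Toy training data, invented for illustration: (comment, set of topic tags).
train = [
    ("you are so stupid and dumb", {"intelligence"}),
    ("what a dumb idiot", {"intelligence"}),
    ("you look ugly in that photo", {"appearance"}),
    ("ugly face and ugly hair", {"appearance"}),
    ("great video thanks for sharing", set()),
    ("nice song love the video", set()),
]
docs = [text for text, _ in train]
topics = ["intelligence", "appearance"]  # hypothetical topic labels

# One independent binary classifier per topic, rather than a single multiclass model.
classifiers = {
    topic: BinaryNB().fit(docs, [topic in tags for _, tags in train])
    for topic in topics
}


def sensitive_topics(comment):
    """Return the set of topics whose binary classifier flags the comment."""
    return {topic for topic, clf in classifiers.items() if clf.predict(comment)}


print(sensitive_topics("you are dumb"))
print(sensitive_topics("love the song"))
```

Because each topic has its own classifier, a comment can be flagged for several topics at once or for none, which a single multiclass classifier cannot express directly.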