Dinakar, Karthik (Massachusetts Institute of Technology) | Jones, Birago (Massachusetts Institute of Technology) | Lieberman, Henry (Massachusetts Institute of Technology) | Picard, Rosalind (Massachusetts Institute of Technology) | Rose, Carolyn (Carnegie Mellon University) | Thoman, Matthew (Northeastern University) | Reichart, Roi (Massachusetts Institute of Technology)
Adolescent cyber-bullying on social networks is a phenomenon that has received widespread attention. Recent work by sociologists has examined this phenomenon under the larger context of teenage drama and it's manifestations on social networks. Tackling cyber-bullying involves two key components – automatic detection of possible cases, and interaction strategies that encourage reflection and emotional support. Key is showing distressed teenagers that they are not alone in their plight. Conventional topic spotting and document classification into labels like "dating" or "sports" are not enough to effectively match stories for this task. In this work, we examine a corpus of 5500 stories from distressed teenagers from a major youth social network. We combine Latent Dirichlet Allocation and human interpretation of its output using principles from sociolinguistics to extract high-level themes in the stories and use them to match new stories to similar ones. A user evaluation of the story matching shows that theme-based retrieval does a better job of finding relevant and effective stories for this application than conventional approaches.