SynBullying: A Multi LLM Synthetic Conversational Dataset for Cyberbullying Detection
Kazemi, Arefeh; Qadeer, Hamza; Wagner, Joachim; Hosseini, Hossein; Kalaivendan, Sri Balaaji Natarajan; Davis, Brian
arXiv.org Artificial Intelligence
We introduce SynBullying, a synthetic multi-LLM conversational dataset for studying and detecting cyberbullying (CB). SynBullying provides a scalable and ethically safe alternative to human data collection by leveraging large language models (LLMs) to simulate realistic bullying interactions. The dataset offers (i) conversational structure, capturing multi-turn exchanges rather than isolated posts; (ii) context-aware annotations, where harmfulness is assessed within the conversational flow considering context, intent, and discourse dynamics; and (iii) fine-grained labeling, covering various CB categories for detailed linguistic and behavioral analysis. We evaluate SynBullying across five dimensions, including conversational structure, lexical patterns, sentiment/toxicity, role dynamics, harm intensity, and CB-type distribution. We further examine its utility by testing its performance as standalone training data and as an augmentation source for CB classification.
Dec-10-2025
- Country:
- Asia
- Middle East > Iran
- Isfahan Province > Isfahan (0.04)
- Thailand > Bangkok
- Bangkok (0.04)
- Europe
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Bulgaria > Varna Province
- Varna (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- North America
- Dominican Republic (0.04)
- Puerto Rico > Peñuelas
- Peñuelas (0.04)
- United States
- Florida > Miami-Dade County
- Miami (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- New York > New York County
- New York City (0.04)
- Pennsylvania > Philadelphia County
- Philadelphia (0.04)
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology > Security & Privacy (1.00)
- Law (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.64)
- Technology: