Defining, Understanding, and Detecting Online Toxicity: Challenges and Machine Learning Approaches
Shahi, Gautam Kishore, Majchrzak, Tim A.
arXiv.org Artificial Intelligence
Online toxic content has grown into a pervasive phenomenon, intensifying during times of crisis, elections, and social unrest. A significant amount of research has focused on detecting or analyzing toxic content using machine-learning approaches. The proliferation of toxic content across digital platforms has spurred extensive research into automated detection mechanisms, primarily driven by advances in machine learning and natural language processing. This study synthesizes 140 publications on different types of toxic content on digital platforms. We present a comprehensive overview of the datasets used in previous studies, focusing on definitions, data sources, challenges, and the machine learning approaches employed in detecting online toxicity, such as hate speech, offensive language, and harmful discourse. The datasets encompass content in 32 languages, covering topics such as elections, spontaneous events, and crises. We examine the possibility of using existing cross-platform data to improve the performance of classification models. We present recommendations and guidelines for new research on online toxic content and the use of content moderation for mitigation. Finally, we present some practical guidelines for mitigating toxic content on online platforms.
Sep-19-2025
- Country:
- Asia
- China
- India (0.04)
- Japan > Kyūshū & Okinawa
- Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
- Middle East > Israel (0.04)
- Russia (0.04)
- Europe
- Hungary (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Kosovo (0.04)
- United Kingdom (0.04)
- Czechia (0.04)
- Germany > Saxony-Anhalt (0.04)
- Italy > Veneto
- Venice (0.04)
- Greece (0.04)
- Russia (0.04)
- Switzerland (0.04)
- Spain (0.04)
- France > Provence-Alpes-Côte d'Azur
- Bouches-du-Rhône > Marseille (0.04)
- Norway > Southern Norway
- Agder > Kristiansand (0.04)
- North America
- Canada > British Columbia
- United States
- Hawaii (0.04)
- New York > New York County
- New York City (0.04)
- South America > Ecuador (0.04)
- Genre:
- Overview (1.00)
- Research Report > New Finding (0.66)
- Industry:
- Government > Regional Government (1.00)
- Health & Medicine (1.00)
- Information Technology
- Security & Privacy (0.94)
- Services (1.00)
- Law (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Media > News (1.00)
- Technology: