"It's Not Just Hate": A Multi-Dimensional Perspective on Detecting Harmful Speech Online

Bianchi, Federico, Hills, Stefanie Anja, Rossini, Patricia, Hovy, Dirk, Tromble, Rebekah, Tintarev, Nava

Oct-27-2022–arXiv.org Artificial Intelligence

Well-annotated data is a prerequisite for good Natural Language Processing models. Too often, though, annotation decisions are governed by optimizing time or annotator agreement. We make a case for nuanced efforts in an interdisciplinary setting for annotating offensive online speech. Detecting offensive content is rapidly becoming one of the most important real-world NLP tasks. However, most datasets use a single binary label, e.g., for hate or incivility, even though each concept is multi-faceted. This modeling choice severely limits both nuanced insights and performance. We show that a more fine-grained multi-label approach to predicting incivility and hateful or intolerant content addresses both conceptual and performance issues. We release a novel dataset of over 40,000 tweets about immigration from the US and UK, annotated with six labels for different aspects of incivility and intolerance. Our dataset not only allows for a more nuanced understanding of harmful speech online; models trained on it also outperform or match performance on benchmark datasets.
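
The contrast between a single binary label and a multi-label annotation can be illustrated with a minimal sketch. The abstract says only that there are six labels and does not name them, so the names `label_1` to `label_6` are placeholders, and the helpers `to_multi_hot` and `to_binary` are hypothetical illustrations rather than part of the released dataset or code.

```python
from typing import Iterable, List

# Assumption: six placeholder label names. The abstract states only that the
# dataset has six labels for aspects of incivility and intolerance.
NUM_LABELS = 6
LABELS = tuple(f"label_{i}" for i in range(1, NUM_LABELS + 1))


def to_multi_hot(active: Iterable[str]) -> List[int]:
    """Encode the labels assigned to one tweet as a 0/1 vector over LABELS."""
    active_set = set(active)
    unknown = active_set - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in active_set else 0 for name in LABELS]


def to_binary(multi_hot: List[int]) -> int:
    """Collapse a multi-label vector into one 'harmful or not' flag."""
    return int(any(multi_hot))


# Two hypothetical tweets flagged for different aspects.
tweet_a = to_multi_hot(["label_1"])
tweet_b = to_multi_hot(["label_4", "label_6"])

# The multi-label vectors differ, but the binary view cannot tell them apart.
print(tweet_a, to_binary(tweet_a))
print(tweet_b, to_binary(tweet_b))
```

Under these assumptions, both tweets collapse to the same binary label even though they were flagged for different aspects, which is the loss of nuance the abstract attributes to single-label datasets.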

artificial intelligence, dataset, natural language

Country:
- North America
  - Mexico (0.14)
  - United States
    - District of Columbia > Washington (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - California > Santa Clara County
      - Stanford (0.04)
- Europe
  - United Kingdom > England
    - Merseyside > Liverpool (0.04)
  - Netherlands > Limburg
    - Maastricht (0.04)
  - Italy > Lombardy
    - Milan (0.04)

Genre:
- Research Report (0.82)

Industry:
- Government > Immigration & Customs (1.00)
- Information Technology > Services (0.94)

Technology:
- Information Technology > Artificial Intelligence > Natural Language (1.00)