dictNN: A Dictionary-Enhanced CNN Approach for Classifying Hate Speech on Twitter

Kupi, Maximilian, Bodnar, Michael, Schmidt, Nikolas, Posada, Carlos Eduardo

Mar-15-2021–arXiv.org Artificial Intelligence

Hate speech on social media is a growing concern, and automated methods have so far struggled to detect it reliably. A major challenge lies in the potentially evasive nature of hate speech, which stems from the ambiguity and fast evolution of natural language. To tackle this, we introduce a vectorisation based on a crowd-sourced and continuously updated dictionary of hate words, and we propose fusing this approach with standard word embeddings to improve the classification performance of a CNN model. To train and test our model, we use a merged corpus of two established datasets (110,748 tweets in total). Adding the dictionary-enhanced input increases the CNN model's predictive power and raises the macro F1 score by seven percentage points.
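
The fusion of a dictionary-based vectorisation with word embeddings can be pictured with a minimal sketch. The abstract does not specify how the two inputs are combined, so the per-token concatenation below is an assumption, not the authors' method. The names `HYPOTHETICAL_LEXICON`, `HYPOTHETICAL_EMBEDDINGS`, `dictionary_flags` and `fuse` are illustrative, and the lexicon and 3-dimensional vectors are toy placeholders standing in for the crowd-sourced dictionary and pretrained embeddings.

```python
# Toy placeholder lexicon (assumption); the paper relies on a crowd-sourced,
# continuously updated dictionary of hate words.
HYPOTHETICAL_LEXICON = {"badword", "slur"}

# Toy 3-dimensional embeddings (assumption); a real model would use
# pretrained or learned word vectors.
HYPOTHETICAL_EMBEDDINGS = {
    "you": [0.1, 0.2, 0.3],
    "are": [0.0, 0.1, 0.0],
    "a": [0.2, 0.0, 0.1],
    "badword": [0.9, 0.8, 0.7],
}
UNKNOWN = [0.0, 0.0, 0.0]  # vector used for out-of-vocabulary tokens


def dictionary_flags(tokens, lexicon):
    """Return 1.0 for each token found in the lexicon, else 0.0."""
    return [1.0 if tok in lexicon else 0.0 for tok in tokens]


def fuse(tokens, embeddings, lexicon, max_len):
    """Build a max_len x (dim + 1) matrix: embedding plus dictionary flag per token."""
    dim = len(UNKNOWN)
    tokens = tokens[:max_len]  # truncate long tweets
    flags = dictionary_flags(tokens, lexicon)
    rows = [embeddings.get(tok, UNKNOWN) + [flag] for tok, flag in zip(tokens, flags)]
    # Zero-pad so every tweet yields the same input shape for a CNN.
    rows += [[0.0] * (dim + 1) for _ in range(max_len - len(rows))]
    return rows


tokens = "You are a badword".lower().split()
matrix = fuse(tokens, HYPOTHETICAL_EMBEDDINGS, HYPOTHETICAL_LEXICON, max_len=6)
for row in matrix:
    print(row)
```

Each row of `matrix` is one token position, and the last column carries the dictionary signal. A convolutional layer sliding over the rows would then see lexical and embedding features side by side. Other fusion strategies, such as a separate input branch for the dictionary vector, are equally plausible readings of the abstract.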

Country:
- South America > Brazil (0.04)
- North America
  - United States
    - New York > New York County
      - New York City (0.04)
    - Massachusetts > Suffolk County
      - Boston (0.04)
    - California > San Diego County
      - San Diego (0.04)
  - Canada > Quebec
    - Montreal (0.04)
- Europe
  - Germany > Berlin (0.04)
  - Spain > Valencian Community
    - Valencia Province > Valencia (0.04)
  - Italy > Tuscany
    - Florence (0.05)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - Middle East > Qatar
    - Ad-Dawhah > Doha (0.04)
  - Japan > Kyūshū & Okinawa
    - Okinawa (0.04)

Genre:
- Overview (0.66)
- Research Report (0.64)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology
  - Communications > Social Media
    - Crowdsourcing (0.66)
  - Artificial Intelligence > Machine Learning
    - Neural Networks > Deep Learning (0.69)
