A Weakly Supervised Classifier and Dataset of White Supremacist Language

Yoder, Michael Miller, Diab, Ahmad, Brown, David West, Carley, Kathleen M.

arXiv.org Artificial Intelligence 

We present a dataset and classifier for detecting the language of white supremacist extremism, a growing issue in online hate speech. Our weakly supervised classifier is trained on large datasets of text from explicitly white supremacist domains paired with neutral and anti-racist data from similar domains. We demonstrate that this approach improves generalization performance to new domains. Incorporating anti-racist texts as counterexamples to white supremacist language mitigates bias.
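
The weak supervision described above can be illustrated with a minimal sketch in which each text inherits its label from the source it was collected from rather than from per-document annotation. The source names, placeholder texts, and the `weak_label` helper below are hypothetical and are not taken from the paper; treating anti-racist and neutral sources as the negative class is one plausible reading of "counterexamples", not a confirmed detail of the authors' setup.

```python
from collections import Counter

# Hypothetical source names (assumption). Labels are assigned per source,
# not per document: 1 = white supremacist source, 0 = neutral or anti-racist.
SOURCE_LABELS = {
    "supremacist_forum": 1,
    "neutral_forum": 0,
    "antiracist_forum": 0,
}


def weak_label(corpus):
    """Map (source, text) pairs to (text, label) pairs.

    Texts from sources with no known label are skipped.
    """
    return [
        (text, SOURCE_LABELS[source])
        for source, text in corpus
        if source in SOURCE_LABELS
    ]


# Placeholder texts only; no real data is reproduced here.
corpus = [
    ("supremacist_forum", "placeholder text a"),
    ("neutral_forum", "placeholder text b"),
    ("antiracist_forum", "placeholder text c"),
    ("unknown_blog", "placeholder text d"),
]

labeled = weak_label(corpus)
label_counts = Counter(label for _, label in labeled)
```

The resulting `labeled` pairs could then be passed to any text classifier. Including anti-racist text in the negative class is what would discourage a model from treating mentions of race or identity alone as a positive signal.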
