Using Crowdsourcing to Improve Profanity Detection
Sood, Sara Owsley (Pomona College) | Antin, Judd (Yahoo! Research) | Churchill, Elizabeth (Yahoo! Research)
Profanity detection is often thought to be an easy task. However, past work has shown that current list-based systems perform poorly. They fail to adapt to evolving profane slang and to identify profane terms that have been disguised or only partially censored (e.g., @ss, f$#%) or intentionally or unintentionally misspelled (e.g., biatch, shiiiit). For these reasons, they are easy to circumvent and have very poor recall. Second, they are a one-size-fits-all solution, assuming that the definition, use, and perception of what is profane or inappropriate hold across all contexts. In this article, we present work that attempts to move beyond list-based profanity detection systems by identifying the context in which profanity occurs. The proposed system uses a set of comments from a social news site labeled by Amazon Mechanical Turk workers for the presence of profanity. This system far surpasses the performance of list-based profanity detection techniques. The use of crowdsourcing in this task suggests an opportunity to build profanity detection systems tailored to sites and communities.
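The recall problem the abstract describes can be illustrated with a minimal sketch of a list-based filter. The word list, function name, and example comments below are hypothetical, not taken from the authors' system; the disguised variants (@ss, biatch) come from the abstract.

```python
import re

# Hypothetical word list of the kind the abstract critiques.
PROFANITY_LIST = {"ass", "bitch"}

def list_based_flag(comment: str) -> bool:
    """Flag a comment only if it contains a listed term verbatim."""
    tokens = re.findall(r"[a-z]+", comment.lower())
    return any(tok in PROFANITY_LIST for tok in tokens)

examples = ["you ass", "you @ss", "what a biatch"]
flags = [list_based_flag(c) for c in examples]
# Only the verbatim spelling is caught; the disguised and misspelled
# variants slip through, which is the poor recall the abstract notes.
```

A learned classifier trained on crowdsourced labels, as the article proposes, can pick up such variants from the labeled examples themselves rather than from a fixed list.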
Mar-25-2012