"Is Hate Lost in Translation?": Evaluation of Multilingual LGBTQIA+ Hate Speech Detection

Chan, Fai Leui; Nguyen, Duke; Joshi, Aditya

Oct-23-2024, arXiv.org Artificial Intelligence

This paper explores the challenges that large language models face in detecting LGBTQIA+ hate speech across multiple languages, including English, Italian, Chinese and (code-switched) English-Tamil. It examines the impact of machine translation and whether the nuances of hate speech are preserved across translation. We examine the hate speech detection ability of zero-shot and fine-tuned GPT. Our findings indicate that: (1) performance is highest for English and lowest for the code-switched English-Tamil scenario, and (2) fine-tuning improves performance consistently across languages, whilst translation yields mixed results. Through simple experiments on original and machine-translated text, along with a qualitative error analysis, this paper sheds light on the socio-cultural nuances and complexities of languages that may not be captured by automatic translation.
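The comparison between original and machine-translated text described above can be illustrated with a minimal sketch. It assumes a binary label set (`hate` / `not_hate`) and macro-averaged F1 as the metric; the abstract names neither, so both are assumptions. The record fields (`language`, `condition`, `gold`, `pred`) and the toy records are hypothetical and are not results from the paper.

```python
from collections import defaultdict


def macro_f1(gold, pred, labels=("hate", "not_hate")):
    """Unweighted mean of per-class F1 scores (label names are assumed)."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def score_by_condition(records):
    """Group records by (language, condition) and compute macro-F1 per group."""
    groups = defaultdict(lambda: ([], []))
    for r in records:
        gold, pred = groups[(r["language"], r["condition"])]
        gold.append(r["gold"])
        pred.append(r["pred"])
    return {key: macro_f1(g, p) for key, (g, p) in groups.items()}


# Made-up toy records for illustration only, not data from the paper.
toy_records = [
    {"language": "it", "condition": "original", "gold": "hate", "pred": "hate"},
    {"language": "it", "condition": "original", "gold": "not_hate", "pred": "not_hate"},
    {"language": "it", "condition": "translated", "gold": "hate", "pred": "not_hate"},
    {"language": "it", "condition": "translated", "gold": "not_hate", "pred": "not_hate"},
]
scores = score_by_condition(toy_records)
```

Scoring each (language, condition) pair separately makes it possible to see whether translating into English helps or hurts for a given language, which is the kind of mixed outcome the abstract reports.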

large language model, machine learning, natural language, (17 more...)

Country:
- Oceania > Australia
  - New South Wales > Sydney (0.04)
- North America
  - United States > Virginia (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - Spain > Andalusia
    - Jaén Province > Jaén (0.04)
  - Middle East > Malta
    - Eastern Region > Northern Harbour District > St. Julian's (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
  - Bulgaria > Varna Province
    - Varna (0.04)
- Asia
  - Singapore (0.04)
  - India > West Bengal
    - Kolkata (0.04)

Genre:
- Research Report (0.84)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.95)
