SEA-SafeguardBench: Evaluating AI Safety in SEA Languages and Cultures

Tasawong, Panuthep; Ngui, Jian Gang; Aji, Alham Fikri; Cohn, Trevor; Limkonchotiwat, Peerat

Dec-8-2025–arXiv.org Artificial Intelligence

Safeguard models help large language models (LLMs) detect and block harmful content, but most evaluations of them remain English-centric and overlook linguistic and cultural diversity. Existing multilingual safety benchmarks often rely on machine-translated English data, which fails to capture nuances in low-resource languages. Southeast Asian (SEA) languages are underrepresented despite the region's linguistic diversity and unique safety concerns, which range from culturally sensitive political speech to region-specific misinformation. Addressing these gaps requires benchmarks that are natively authored to reflect local norms and harm scenarios. We introduce SEA-SafeguardBench, the first human-verified safety benchmark for SEA, covering eight languages and 21,640 samples across three subsets: general, in-the-wild, and content generation. Our experiments on the benchmark show that even state-of-the-art LLMs and guardrails struggle with SEA cultural and harm scenarios and perform worse on them than on English text.
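The comparison between SEA-language and English performance described above can be illustrated with a minimal sketch. It assumes each sample carries a language code, a gold harmful/safe label, and a safeguard's binary prediction, and it scores each language with F1 on the harmful class. The abstract does not state which metric the authors use, so F1, the function names `f1_by_language` and `gap_to_english`, the language codes, and the toy records are all hypothetical and are not taken from the benchmark.

```python
from collections import defaultdict

def f1_by_language(records):
    """Compute F1 on the harmful class for each language.

    records: iterable of (language, gold_is_harmful, predicted_is_harmful).
    The record layout is an assumption made for this sketch.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for lang, gold, pred in records:
        c = counts[lang]  # creates the entry even for true negatives
        if gold and pred:
            c["tp"] += 1
        elif pred and not gold:
            c["fp"] += 1
        elif gold and not pred:
            c["fn"] += 1
    scores = {}
    for lang, c in counts.items():
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives
        scores[lang] = 2 * c["tp"] / denom if denom else 0.0
    return scores

def gap_to_english(scores, english="en"):
    """Return the English score minus each other language's score."""
    return {lang: scores[english] - s
            for lang, s in scores.items() if lang != english}

# Toy records invented for illustration only; not benchmark data.
toy_records = [
    ("en", True, True), ("en", False, False),
    ("en", True, True), ("en", False, True),
    ("th", True, False), ("th", True, True),
    ("th", False, True), ("th", False, False),
]

scores = f1_by_language(toy_records)
gaps = gap_to_english(scores)
```

A positive value in `gaps` means the safeguard scores lower on that language than on English, which is the kind of shortfall the abstract reports. A threshold-free metric could be substituted if the safeguard outputs scores rather than binary decisions.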

large language model, machine learning, natural language


