XGUARD: A Graded Benchmark for Evaluating Safety Failures of Large Language Models on Extremist Content