OR-Bench: An Over-Refusal Benchmark for Large Language Models
