STOP! Benchmarking Large Language Models with Sensitivity Testing on Offensive Progressions
