Does Safety Training of LLMs Generalize to Semantically Related Natural Prompts?
