When Style Breaks Safety: Defending LLMs Against Superficial Style Alignment
