An Embarrassingly Simple Defense Against LLM Abliteration Attacks
