Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections
