Watch your steps: Dormant Adversarial Behaviors that Activate upon LLM Finetuning
