Detecting Adversarial Fine-tuning with Auditing Agents
