Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring
