LatentBreak: Jailbreaking Large Language Models through Latent Space Feedback
