Improved Large Language Model Jailbreak Detection via Pretrained Embeddings
