Tree of Attacks: Jailbreaking Black-Box LLMs Automatically

Neural Information Processing Systems

While Large Language Models (LLMs) display versatile functionality, they continue to generate harmful, biased, and toxic content, as demonstrated by the prevalence of human-designed jailbreaks.