Tree of Attacks: Jailbreaking Black-Box LLMs Automatically

Neural Information Processing Systems 

While Large Language Models (LLMs) display versatile functionality, they continue to generate harmful, biased, and toxic content, as demonstrated by the prevalence of human-designed jailbreaks.
