Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
–Neural Information Processing Systems
While Large Language Models (LLMs) display versatile functionality, they continue to generate harmful, biased, and toxic content, as demonstrated by the prevalence of human-designed jailbreaks.