Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
Neural Information Processing Systems
While Large Language Models (LLMs) display versatile functionality, they continue to generate harmful, biased, and toxic content, as demonstrated by the prevalence of human-designed jailbreaks.
Oct-10-2025, 05:44:11 GMT
- Country:
- Genre:
- Research Report
- New Finding (1.00)
- Experimental Study (1.00)
- Industry:
- Transportation (1.00)
- Media > News (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Military (0.67)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)
- Technology: