Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs
Neural Information Processing Systems
To address these issues, we introduced JailTrickBench to evaluate the impact of various attack settings on LLM performance and provide a baseline for jailbreak attacks, encouraging the adoption of a standardized evaluation framework.
Oct-9-2025, 23:32:57 GMT
- Country:
- Africa (0.04)
- Asia > China
- Guangdong Province > Guangzhou (0.04)
- Hong Kong (0.04)
- North America
- Canada > British Columbia
- Vancouver (0.04)
- United States
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Ohio (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Government (0.93)
- Health & Medicine > Therapeutic Area (0.68)
- Information Technology > Security & Privacy (1.00)
- Law (1.00)
- Law Enforcement & Public Safety (1.00)
- Media (0.67)
- Transportation > Infrastructure & Services (0.67)
- Technology: