Fight Back Against Jailbreaking via Prompt Adversarial Tuning
Yichuan Mo, Yuji Wang, Zeming Wei, Yisen Wang
Neural Information Processing Systems
While Large Language Models (LLMs) have achieved tremendous success in various applications, they are also susceptible to jailbreaking attacks.