Fight Back Against Jailbreaking via Prompt Adversarial Tuning
Yichuan Mo, Yuji Wang, Zeming Wei, Yisen Wang
Neural Information Processing Systems
While Large Language Models (LLMs) have achieved tremendous success in various applications, they are also susceptible to jailbreaking attacks.