SoK: Prompt Hacking of Large Language Models

Rababah, Baha; Shang; Wu; Kwiatkowski, Matthew; Leung, Carson; Akcora, Cuneyt Gurcan

Oct-15-2024–arXiv.org Artificial Intelligence

The safety and robustness of large language model (LLM)-based applications remain critical challenges in artificial intelligence. Among the key threats to these applications are prompt hacking attacks, which can significantly undermine the security and reliability of LLM-based systems. In this work, we offer a comprehensive and systematic overview of three distinct types of prompt hacking (jailbreaking, leaking, and injection), addressing the nuances that differentiate them despite their overlapping characteristics. To enhance the evaluation of LLM-based applications, we propose a novel framework that categorizes LLM responses into five distinct classes, moving beyond the traditional binary classification. This approach provides more granular insight into the model's behavior, improving diagnostic precision and enabling more targeted enhancements to the system's safety and robustness.

large language model, machine learning, natural language

Genre:
- Overview (1.00)
- Research Report (0.82)

Industry:
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
