Exploiting Latent Space Discontinuities for Building Universal LLM Jailbreaks and Data Extraction Attacks

Paim, Kayua Oleques, Mansilha, Rodrigo Brandao, Kreutz, Diego, Franco, Muriel Figueredo, Cordeiro, Weverton

Nov-4-2025–arXiv.org Artificial Intelligence

The rapid proliferation of Large Language Models (LLMs) has raised significant concerns about their security against adversarial attacks. In this work, we propose a novel approach to crafting universal jailbreaks and data extraction attacks by exploiting latent space discontinuities, an architectural vulnerability related to the sparsity of training data. Initial results indicate that exploiting these discontinuities can consistently and profoundly compromise model behavior, even in the presence of layered defenses. The findings suggest that this strategy has substantial potential as a systemic attack vector.

Disclaimer: This paper contains examples of harmful and offensive language. Additional supporting materials may be provided upon formal request and are subject to the signing of a liability and ethical use agreement.

Large Language Models (LLMs) are enabling novel applications of Artificial Intelligence (AI) and transforming human activities through conversational models (e.g., ChatGPT, DeepSeek, Gemini, Llama, and Claude). LLMs allow for natural human-AI interaction and support specialized applications across multiple domains, including image generation (e.g., Adobe Firefly and Pixlr), code automation (e.g., GitHub Copilot and Amazon CodeWhisperer), and retrieval-augmented generation systems (e.g., Perplexity AI and IBM watsonx). Users may interact with these models directly through a Web interface or indirectly via APIs.

large language model, machine learning, natural language

Country:
- South America > Brazil (0.28)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (0.95)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.94)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
