Exploiting Latent Space Discontinuities for Building Universal LLM Jailbreaks and Data Extraction Attacks
Paim, Kayua Oleques; Mansilha, Rodrigo Brandao; Kreutz, Diego; Franco, Muriel Figueredo; Cordeiro, Weverton
arXiv.org Artificial Intelligence
The rapid proliferation of Large Language Models (LLMs) has raised significant concerns about their security against adversarial attacks. In this work, we propose a novel approach to crafting universal jailbreaks and data extraction attacks by exploiting latent space discontinuities, an architectural vulnerability related to the sparsity of training data. Initial results indicate that, when exploited, these discontinuities can consistently and profoundly compromise model behavior, even in the presence of layered defenses. The findings suggest that this strategy has substantial potential as a systemic attack vector.

Disclaimer: This paper contains examples of harmful and offensive language. Additional supporting materials may be provided upon formal request and are subject to the signing of a liability and ethical use agreement.

Large Language Models (LLMs) are enabling novel applications of Artificial Intelligence (AI) and transforming human activities through conversational models (e.g., ChatGPT, DeepSeek, Gemini, Llama, and Claude). LLMs allow for natural human-AI interaction and specialized applications across multiple domains, including image generation (e.g., Adobe Firefly and Pixlr), code automation (e.g., GitHub Copilot and Amazon CodeWhisperer), and retrieval-augmented generation systems (e.g., Perplexity AI and IBM watsonx). These interactions may take place through different interfaces: directly with the user through a Web interface, or indirectly through APIs.
Nov-4-2025
- Country:
  - Asia > South Korea (0.04)
  - South America > Brazil
    - Rio Grande do Sul (0.04)
    - São Paulo (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Industry:
  - Health & Medicine (0.95)
  - Information Technology > Security & Privacy (1.00)
  - Law (1.00)
  - Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.94)
- Technology: