Can Transformer Memory Be Corrupted? Investigating Cache-Side Vulnerabilities in Large Language Models

Hossain, Elias, Saha, Swayamjit, Roy, Somshubhra, Prasad, Ravi

Oct-21-2025–arXiv.org Artificial Intelligence

Even when prompts and parameters are secured, transformer language models remain vulnerable: the key-value (KV) cache used during inference is an overlooked attack surface. This paper introduces Malicious Token Injection (MTI), a modular framework that systematically perturbs cached key vectors at selected layers and timesteps, with controlled magnitude and frequency, using additive Gaussian noise, zeroing, and orthogonal rotations. A theoretical analysis quantifies how these perturbations propagate through attention, linking logit deviations to the Frobenius norm of the corruption and to softmax Lipschitz dynamics. Empirical results show that MTI significantly alters next-token distributions and downstream task performance on GPT-2 and LLaMA-2/7B, and destabilizes retrieval-augmented and agentic reasoning pipelines. These findings identify cache integrity as a critical yet underexplored vulnerability in current LLM deployments, positioning cache corruption as a reproducible and theoretically grounded threat model for future robustness and security research.
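The three perturbation types named in the abstract can be illustrated on a toy single-head key cache. The following is a minimal NumPy sketch, not the authors' MTI implementation: the function names (`perturb_keys`, `attention_logits`, `logit_deviation_and_bound`), the `sigma` parameter, and the toy shapes are all assumptions made for illustration. The bound it checks is the elementary Cauchy-Schwarz inequality, which ties the change in scaled attention logits to the Frobenius norm of the key corruption; it is not necessarily the exact statement proved in the paper.

```python
import numpy as np


def perturb_keys(K, mode, positions, sigma=0.5, rng=None):
    """Return a copy of cached keys K (shape T x d) with selected rows corrupted.

    Hypothetical helper: mode is "gaussian", "zero", or "rotate".
    """
    rng = np.random.default_rng(0) if rng is None else rng
    K_bad = K.copy()
    idx = np.asarray(positions)
    d = K.shape[1]
    if mode == "gaussian":
        # Additive Gaussian noise with standard deviation sigma.
        K_bad[idx] = K_bad[idx] + sigma * rng.standard_normal((idx.size, d))
    elif mode == "zero":
        # Zero out the selected cached keys.
        K_bad[idx] = 0.0
    elif mode == "rotate":
        # Random orthogonal matrix from the QR decomposition of a Gaussian matrix.
        Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
        K_bad[idx] = K_bad[idx] @ Q
    else:
        raise ValueError(f"unknown mode: {mode}")
    return K_bad


def attention_logits(q, K):
    """Scaled dot-product attention logits of one query against all cached keys."""
    return K @ q / np.sqrt(K.shape[1])


def logit_deviation_and_bound(q, K, K_bad):
    """Return (||logits' - logits||_2, ||q||_2 * ||K' - K||_F / sqrt(d))."""
    deviation = np.linalg.norm(attention_logits(q, K_bad) - attention_logits(q, K))
    bound = np.linalg.norm(q) * np.linalg.norm(K_bad - K, "fro") / np.sqrt(K.shape[1])
    return deviation, bound


# Toy cache: 8 cached timesteps, head dimension 16 (arbitrary illustrative sizes).
rng = np.random.default_rng(0)
K = rng.standard_normal((8, 16))
q = rng.standard_normal(16)

for mode in ("gaussian", "zero", "rotate"):
    K_bad = perturb_keys(K, mode, positions=[2, 5], rng=rng)
    deviation, bound = logit_deviation_and_bound(q, K, K_bad)
    print(f"{mode:8s} logit deviation = {deviation:.3f}, Frobenius bound = {bound:.3f}")
```

In this sketch the deviation never exceeds the bound, and rows outside `positions` are left untouched, which mirrors the idea of corrupting only selected timesteps. A real attack on GPT-2 or LLaMA-2 would have to operate on the model's actual per-layer, per-head cache tensors, which this toy example does not attempt.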

large language model, machine learning, natural language

Country:
- North America
  - Canada (0.04)
  - United States
    - California > San Diego County
      - San Diego (0.04)
    - Florida > Orange County
      - Orlando (0.14)
    - Mississippi > Mississippi County
      - Mississippi State (0.04)
    - North Carolina > Wake County
      - Raleigh (0.04)
    - Texas > Travis County
      - Austin (0.04)
    - Washington > King County
      - Seattle (0.04)

Genre:
- Research Report > New Finding (1.00)

Industry:
- Information Technology > Security & Privacy (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language
    - Chatbot (1.00)
    - Large Language Model (1.00)