Can Transformer Memory Be Corrupted? Investigating Cache-Side Vulnerabilities in Large Language Models

📅 2025-10-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies the key-value (KV) cache—a previously overlooked attack surface in large language model (LLM) inference—as a critical security vulnerability: even when prompts and model parameters are protected, adversaries can systematically perturb cached key vectors via malicious token injection (MTI), thereby distorting next-token prediction distributions and degrading downstream task performance. The authors formally establish KV cache integrity as a fundamental security dimension and propose a modular cache perturbation framework enabling tunable perturbation strength and layer- or timestep-specific targeting. Leveraging Frobenius norm constraints and softmax Lipschitz continuity analysis, they develop a theoretical model characterizing perturbation propagation through the attention mechanism. Experiments on GPT-2 and LLaMA-2/7B demonstrate that MTI significantly impairs retrieval-augmented generation and agent-based reasoning performance. This work introduces a novel threat paradigm for LLMs and establishes foundational benchmarks for cache-aware security defenses.

📝 Abstract
Even when prompts and parameters are secured, transformer language models remain vulnerable because their key-value (KV) cache during inference constitutes an overlooked attack surface. This paper introduces Malicious Token Injection (MTI), a modular framework that systematically perturbs cached key vectors at selected layers and timesteps through controlled magnitude and frequency, using additive Gaussian noise, zeroing, and orthogonal rotations. A theoretical analysis quantifies how these perturbations propagate through attention, linking logit deviations to the Frobenius norm of the corruption and to softmax Lipschitz dynamics. Empirical results show that MTI significantly alters next-token distributions and downstream task performance on GPT-2 and LLaMA-2/7B, and destabilizes retrieval-augmented and agentic reasoning pipelines. These findings identify cache integrity as a critical yet underexplored vulnerability in current LLM deployments, positioning cache corruption as a reproducible and theoretically grounded threat model for future robustness and security research.
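The three perturbation modes named in the abstract (additive Gaussian noise, zeroing, and orthogonal rotations of cached key vectors) can be sketched in a few lines of NumPy. This is an illustrative reconstruction from the abstract's description, not the paper's actual code; all function and parameter names (`perturb_keys`, `attn_weights`, `sigma`) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_keys(k_cache, mode="gaussian", sigma=0.1, rng=rng):
    """Apply one MTI-style perturbation to cached key vectors.

    k_cache: array of shape (seq_len, d_head), cached keys at one layer.
    Names and defaults here are illustrative, not the paper's API.
    """
    k = k_cache.copy()
    if mode == "gaussian":      # additive Gaussian noise of magnitude sigma
        k += sigma * rng.standard_normal(k.shape)
    elif mode == "zero":        # erase the cached keys entirely
        k[:] = 0.0
    elif mode == "rotate":      # random orthogonal rotation (Frobenius-norm preserving)
        q, _ = np.linalg.qr(rng.standard_normal((k.shape[1], k.shape[1])))
        k = k @ q
    return k

def attn_weights(q, k):
    """Scaled dot-product attention weights for a single query vector."""
    logits = k @ q / np.sqrt(q.shape[0])
    e = np.exp(logits - logits.max())
    return e / e.sum()

K = rng.standard_normal((8, 16))
q = rng.standard_normal(16)
clean = attn_weights(q, K)
corrupt = attn_weights(q, perturb_keys(K, mode="gaussian", sigma=0.5))
shift = np.abs(clean - corrupt).sum()  # how far the attention distribution moved
```

Even this toy setup shows the threat shape: the model weights and the query are untouched, yet the attention distribution over cached positions shifts, which is exactly the channel the paper exploits.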
Problem

Research questions and friction points this paper is trying to address.

Transformer KV cache vulnerabilities enable inference-time attacks
Malicious Token Injection framework corrupts cached key vectors systematically
Cache perturbations alter model outputs and compromise reasoning pipelines
Innovation

Methods, ideas, or system contributions that make the work stand out.

MTI framework perturbs KV cache via controlled noise injection
Theoretical analysis links logit deviations to corruption metrics
Cache corruption destabilizes reasoning pipelines across multiple models
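The bullet linking logit deviations to corruption metrics plausibly rests on an argument of the following shape; this is a sketch inferred from the abstract's mention of Frobenius norms and softmax Lipschitz continuity, not the paper's exact statement. If the cached keys $K$ are corrupted to $K + \Delta K$, the pre-softmax logits for a query $q$ shift by

```latex
\Delta z = \frac{1}{\sqrt{d}}\,\Delta K\, q,
\qquad
\|\Delta z\|_2 \;\le\; \frac{\|\Delta K\|_F\,\|q\|_2}{\sqrt{d}},
```

and since the softmax map is $1$-Lipschitz with respect to the $\ell_2$ norm, the resulting attention distribution moves by at most $\|\mathrm{softmax}(z+\Delta z) - \mathrm{softmax}(z)\|_2 \le \|\Delta z\|_2$. A bound of this form makes the attack tunable: the Frobenius norm of the injected corruption directly caps the distributional shift.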
Elias Hossain
PhD Student, University of Central Florida, USA
(Deep) Machine Learning · Trustworthy AI · LLM Reasoning · Bioinformatics
Swayamjit Saha
Mississippi State University
Deep Learning · Artificial Intelligence · Robotics · Cybersecurity
Somshubhra Roy
Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7911, United States
Ravi Prasad
Department of Computer Science and Engineering, Mississippi State University, Mississippi State, MS 39762, United States