Strengthening Polymorphic Prompt Assembling: Dynamic Separator Generation Against Emerging Prompt Injection Attacks

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a weakness of static delimiters in multi-turn interactions: because delimiters are reused across requests, a single leaked delimiter can be exploited in later requests, which widens the “blast radius” of a prompt injection attack. To mitigate this, the authors propose single-use delimiters. Each request receives a unique pair of prompt boundary markers, derived by hashing the timestamp, session ID, and a cryptographic nonce with domain-separated SHA-256, so that every marker is bound to its own request context. The impact of a leaked delimiter is thereby confined to the request in which it leaked, and no model fine-tuning is required. In experiments on Llama-3.3-70B-Instruct-Turbo, with cross-model validation on DeepSeek-V4-Flash, the dynamic mode lowers the attack success rate for the M1 obfuscation payload from 0.88 to 0.38 and reduces the separator leak rate under the format_breakout_salad payload from 0.467 to 0.000, while adding 2.7 microseconds of prompt-assembly overhead per request.
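The mechanism summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the domain labels (`ppa/begin`, `ppa/end`), the `<<BEGIN-…>>` marker layout, the 16-byte nonce, and the truncation to 32 hex characters are all assumptions made for illustration. Only the inputs (timestamp, session identifier, cryptographic nonce) and the use of domain-separated SHA-256 come from the summary.

```python
import hashlib
import secrets
import time
from typing import Optional, Tuple


def _domain_digest(domain: bytes, fields: Tuple[bytes, ...]) -> str:
    """SHA-256 over a domain label plus length-prefixed fields.

    Length-prefixing each part keeps the encoding unambiguous, so
    ("ab", "c") and ("a", "bc") never hash to the same value.
    """
    h = hashlib.sha256()
    for part in (domain, *fields):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()


def make_separator_pair(
    session_id: str,
    nonce: Optional[bytes] = None,
    timestamp_ns: Optional[int] = None,
    hex_chars: int = 32,  # assumed truncation length, not from the paper
) -> Tuple[str, str]:
    """Return a fresh (BEGIN, END) marker pair for a single request."""
    if nonce is None:
        nonce = secrets.token_bytes(16)  # fresh randomness per request
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    fields = (str(timestamp_ns).encode("ascii"), session_id.encode("utf-8"), nonce)
    # Hypothetical domain labels: the same inputs give unrelated BEGIN/END digests.
    begin = "<<BEGIN-" + _domain_digest(b"ppa/begin", fields)[:hex_chars] + ">>"
    end = "<<END-" + _domain_digest(b"ppa/end", fields)[:hex_chars] + ">>"
    return begin, end


def assemble_prompt(system_instructions: str, user_input: str, session_id: str) -> str:
    """Wrap untrusted input in markers that are never reused across requests."""
    begin, end = make_separator_pair(session_id)
    return f"{system_instructions}\n{begin}\n{user_input}\n{end}"
```

Because the nonce is drawn fresh on every call, a marker that leaks in one response says nothing about the markers of the next request, which is the blast-radius property the summary describes.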

📝 Abstract

Polymorphic Prompt Assembling (PPA) defends LLM agents against prompt injections by randomly selecting separator pairs from a fixed pool to isolate user input from system instructions. Although effective, static pool reuse exposes a blast-radius vulnerability: once a separator leaks, it can be exploited in future requests. We propose dynamic per-request separator generation using domain-separated SHA-256 digests keyed on the timestamp, session identifier, and a cryptographic nonce. Each assembled prompt receives a unique (BEGIN, END) canary pair, thereby limiting leakage exposure to a single request. We evaluated our extension against 16 injection payloads on Llama-3.3-70B-Instruct-Turbo, with cross-model validation on the DeepSeek-V4-Flash model. Against the M1 obfuscation payload (leetspeak + urgency), the dynamic mode reduces the Attack Success Rate (ASR) from 0.88 to 0.38, a statistically significant 2.3x mitigation verified by non-overlapping 95% Wilson confidence intervals. Against format_breakout_salad, static separator leakage (leak_rate = 0.467) is eliminated entirely in the dynamic mode (leak_rate = 0.000), confirming the blast-radius reduction in practice. The implementation requires no model fine-tuning, adds 2.7 microseconds of prompt-assembly overhead per request, and is backward compatible with the existing PPA SDK.
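For readers unfamiliar with the significance check mentioned in the abstract, the following Python sketch computes a Wilson score interval and tests two intervals for overlap. The abstract does not state how many trials underlie each ASR, so the trial count in the usage lines (100 per mode) is purely hypothetical and chosen only to make the example concrete. The function names are likewise illustrative.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def wilson_interval(successes: int, trials: int, confidence: float = 0.95) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to guard against floating-point drift at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


def intervals_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical counts: 100 trials per mode is an assumption, not a reported figure.
static_ci = wilson_interval(88, 100)   # ASR 0.88 in static mode
dynamic_ci = wilson_interval(38, 100)  # ASR 0.38 in dynamic mode
significant = not intervals_overlap(static_ci, dynamic_ci)
```

Interval width depends strongly on the number of trials, so whether two intervals overlap at the reported rates can only be judged with the paper's actual counts.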

Problem

Research questions and friction points this paper is trying to address.

Prompt Injection

Polymorphic Prompt Assembling

Blast-radius Vulnerability

Separator Leakage

LLM Security

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Separator Generation

Prompt Injection Defense

Polymorphic Prompt Assembling