Strengthening Polymorphic Prompt Assembling: Dynamic Separator Generation Against Emerging Prompt Injection Attacks

πŸ“… 2026-05-28
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF

career value

178K/year
πŸ€– AI Summary
This work addresses the vulnerability of static delimiters in multi-turn interactions, which are prone to reuse and thereby expand the β€œblast radius” of prompt injection attacks. To mitigate this, the authors propose a dynamic, single-use delimiter mechanism that generates request-unique prompt boundary markers by hashing a combination of timestamp, session ID, and a cryptographic nonce using SHA-256. This approach strictly confines the impact of delimiter leakage to individual requests and leverages domain-separated hashing with context-binding architecture, eliminating the need for model fine-tuning. Experimental results on Llama-3.3-70B and DeepSeek-V4-Flash demonstrate a reduction in typical attack success rates from 0.88 to 0.38 and completely eliminate risks associated with format-breaking attacks, while introducing only a minimal overhead of 2.7 microseconds per request.
πŸ“ Abstract
Polymorphic Prompt Assembling (PPA) defends LLM agents against prompt injections by randomly selecting separator pairs from a fixed pool to isolate user input from system instructions. Although effective, static pool reuse exposes a blast-radius vulnerability: once a separator leaks, it can be exploited in future requests. We propose a dynamic per-request separator generation using domain-separated SHA-256 digests keyed on the timestamp, session identifier, and cryptographic nonce. Each assembled prompt receives a unique (BEGIN, END) canary pair, thereby limiting leakage exposure to a single request. We evaluated our extension against 16 injection payloads on Llama-3.3-70B-Instruct-Turbo, with cross-model validation on DeepSeek-V4-Flash model. Against the M1 obfuscation payload (leetspeak + urgency), the dynamic mode reduces the Attack Success Rate (ASR) from 0.88 to 0.38, yielding a statistically significant 2.3 x mitigation verified by non-overlapping 95% Wilson confidence intervals. Against format_breakout_salad, static separator leakage (leak_rate = 0.467) is eliminated entirely in the dynamic mode (0.000), confirming the blast-radius reduction in practice. The implementation requires no model fine-tuning, adds 2.7 microseconds prompt-assembly overhead per request, and is backward compatible with the existing PPA SDK.
Problem

Research questions and friction points this paper is trying to address.

Prompt Injection
Polymorphic Prompt Assembling
Blast-radius Vulnerability
Separator Leakage
LLM Security
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Separator Generation
Prompt Injection Defense
Polymorphic Prompt Assembling
Blast-Radius Mitigation
LLM Security
πŸ’Ό Related Jobs