Send a SCOUT First: Pre-hoc Reasoning for Adaptive Detector Allocation in Prompt-Injection Defense

📅 2026-05-29

🤖 AI Summary

This work addresses a limitation of existing prompt-injection defenses: they commit every request to a single fixed detector, even though detectors are heterogeneous and each is strong on only a slice of attacks. The authors propose SCOUT, a framework that reframes defense as per-request detector allocation. SCOUT predicts each detector's reliability and latency for the current request from how that detector behaved on similar past inputs, then decides which detectors to run and whether to escalate to an LLM judge, governed by a single operator-tunable safety-utility threshold. The authors also build SCOUT-450, a benchmark of structurally complex, agent-facing injections that older prompt-injection sets under-represent. On SCOUT-450, a safety-oriented operating point reduces attack-success rate by 46% and total wall-clock time by 40% relative to an always-on GPT-4o judge, at the cost of a 5.1-point drop in benign utility. SCOUT also improves the safety-utility frontier on three external benchmarks (BIPIA, IPI, and IHEval).

📝 Abstract

Prompt-injection detectors are heterogeneous: each is strong on a different slice of attacks, and none is always reliable. Yet existing systems still treat detection as a fixed single-detector pipeline, committing every request to one detector's blind spots. We reframe defense as detector allocation: given a heterogeneous pool, decide per request which detectors to run and whether to escalate to an LLM judge. Our framework SCOUT (Scalable and Controllable Outcome-prediction for Uncertainty-aware Triage) makes this decision dynamic by predicting each detector's per-sample reliability and latency from how it behaved on similar past inputs, and exposes a single safety-utility threshold to the operator (where utility bundles benign-pass rate and wall-clock time). To evaluate this setting, we build SCOUT-450, a benchmark that captures the structurally complex, agent-facing injections that older prompt-injection sets under-represent. On SCOUT-450, a safety-oriented operating point reduces attack-success rate by 46% and total wall-clock time by 40% relative to an always-on GPT-4o judge, at a 5.1-point benign-utility drop. SCOUT also transfers to three external benchmarks (BIPIA, IPI, and IHEval), improving the safety-utility frontier.
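
To make the allocation idea concrete, here is a minimal sketch of per-request detector selection. It is not the authors' implementation. It assumes a k-nearest-neighbor estimate over past runs as a stand-in for SCOUT's outcome predictor, and a simple rule (fastest detector whose predicted reliability clears the threshold, otherwise escalate) as a stand-in for its allocation policy. The names `PastRun`, `predict`, `allocate`, the detector names, and the 2-D feature vectors are all hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Sequence, Tuple


@dataclass
class PastRun:
    """One logged detector run on a past input (hypothetical record layout)."""
    features: Sequence[float]  # assumed feature vector of the past input
    correct: bool              # whether the detector's verdict was right
    latency_ms: float          # observed latency of that run


def _distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(history: List[PastRun], query: Sequence[float], k: int = 3) -> Tuple[float, float]:
    """Estimate (reliability, latency_ms) from the k most similar past runs."""
    nearest = sorted(history, key=lambda r: _distance(r.features, query))[:k]
    if not nearest:
        return 0.0, 0.0  # no evidence: treat the detector as unreliable
    reliability = sum(r.correct for r in nearest) / len(nearest)
    latency = sum(r.latency_ms for r in nearest) / len(nearest)
    return reliability, latency


def allocate(histories: Dict[str, List[PastRun]], query: Sequence[float],
             threshold: float, k: int = 3) -> str:
    """Return the fastest detector predicted to meet the threshold, else escalate."""
    candidates = []
    for name, history in histories.items():
        reliability, latency = predict(history, query, k)
        if reliability >= threshold:
            candidates.append((latency, name))
    if candidates:
        return min(candidates)[1]  # lowest predicted latency wins
    return "llm_judge"  # assumed label for the escalation path


# Toy history: the cheap filter is reliable near (0, 0) but wrong near (1, 1).
histories = {
    "regex_filter": [
        PastRun((0.0, 0.0), True, 2.0),
        PastRun((0.1, 0.0), True, 3.0),
        PastRun((1.0, 1.0), False, 2.0),
    ],
    "classifier": [
        PastRun((0.0, 0.1), True, 40.0),
        PastRun((1.0, 0.9), True, 45.0),
        PastRun((0.9, 1.0), True, 50.0),
    ],
}

easy = allocate(histories, (0.05, 0.0), threshold=0.9, k=2)
hard = allocate(histories, (1.0, 1.0), threshold=0.9, k=2)
fallback = allocate({"regex_filter": histories["regex_filter"]}, (1.0, 1.0), threshold=0.9, k=2)
print(easy, hard, fallback)
```

In this toy setup, raising `threshold` pushes more requests toward slower detectors or the judge, and lowering it favors cheap detectors. That is the role the abstract assigns to the single safety-utility threshold, although the actual predictor and policy in the paper may differ.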

Problem

Research questions and friction points this paper is trying to address.

prompt-injection

detector allocation

heterogeneous detectors

adaptive defense

safety-utility tradeoff

Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive detector allocation

prompt-injection defense

heterogeneous detectors