🤖 AI Summary
Text prompts in diffusion models encode high-value knowledge but are vulnerable to prompt stealing. We identify two fundamental limitations: (1) existing numerical-optimization-based prompt recovery methods ignore the initial noise used during image generation, and (2) mainstream frameworks—including PyTorch—exhibit a cryptographic weakness (CWE-339) due to CPU random seed space restriction (2³²), enabling deterministic noise reconstruction. Method: We propose SeedSnitch, an efficient brute-force seed recovery tool, and PromptPirate, a novel attack framework integrating genetic optimization with seed recovery to enable end-to-end prompt extraction. Contribution/Results: Evaluated on CivitAI data, SeedSnitch recovers the correct seed for 95% of shared images within 140 minutes, and PromptPirate achieves 8–11% higher LPIPS similarity than baseline methods. Our findings have directly motivated security patches in PyTorch and other major deep learning frameworks.
📝 Abstract
Diffusion models have significantly advanced text-to-image generation, enabling the creation of highly realistic images conditioned on textual prompts and seeds. Given the considerable intellectual and economic value embedded in such prompts, prompt theft poses a critical security and privacy concern. In this paper, we investigate prompt-stealing attacks targeting diffusion models. We reveal that numerical optimization-based prompt recovery methods are fundamentally limited as they do not account for the initial random noise used during image generation. We identify and exploit a noise-generation vulnerability (CWE-339), prevalent in major image-generation frameworks, originating from PyTorch's restriction of seed values to a range of 2³² when generating the initial random noise on CPUs. Through a large-scale empirical analysis conducted on images shared via the popular platform CivitAI, we demonstrate that approximately 95% of these images' seed values can be effectively brute-forced in 140 minutes per seed using our seed-recovery tool, SeedSnitch. Leveraging the recovered seed, we propose PromptPirate, a genetic algorithm-based optimization method explicitly designed for prompt stealing. PromptPirate surpasses state-of-the-art methods, namely PromptStealer, P2HP, and CLIP-Interrogator, achieving an 8–11% improvement in LPIPS similarity. Furthermore, we introduce straightforward and effective countermeasures that render seed stealing, and thus optimization-based prompt stealing, ineffective. We have disclosed our findings responsibly and initiated coordinated mitigation efforts with the developers to address this critical vulnerability.
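The core of the seed-recovery step is that initial noise is a deterministic function of the seed, so a restricted seed space can simply be enumerated until the regenerated noise matches the noise inferred from the shared image. The sketch below is not the SeedSnitch implementation (which targets PyTorch's 2³² CPU seed space); it is a minimal, self-contained illustration of the same brute-force principle using Python's stdlib `random` and a deliberately tiny 2¹⁶ seed space. The names `initial_noise` and `brute_force_seed` are illustrative, not from the paper.

```python
import random

SEED_BITS = 16  # tiny demo space; the real attack enumerates PyTorch's 2**32 CPU seeds


def initial_noise(seed, n=8):
    """Stand-in for deterministic initial-noise generation: seed -> Gaussian noise vector."""
    rng = random.Random(seed)
    return tuple(rng.gauss(0.0, 1.0) for _ in range(n))


def brute_force_seed(target_noise, seed_space=2**SEED_BITS):
    """Test every candidate seed until the regenerated noise matches the observed noise."""
    for seed in range(seed_space):
        if initial_noise(seed) == target_noise:
            return seed
    return None  # seed was drawn outside the enumerated space


# A "victim" generates noise from a secret seed; the attacker only sees the noise.
secret_seed = 31337
observed = initial_noise(secret_seed)
recovered = brute_force_seed(observed)
print(recovered)  # prints 31337
```

Because the search is embarrassingly parallel and each candidate costs only one RNG replay, even a 2³² space becomes tractable on commodity hardware, which is why the paper's countermeasure is to widen the effective seed space rather than to slow down generation.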