Customization under Fire: Plugin Poisoning in Text-to-Image Ecosystem

📅 2026-06-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a novel supply-chain security threat emerging from the widespread sharing of LoRA adapters in text-to-image (T2I) model ecosystems. It formalizes this attack surface for the first time and introduces PoisonLoRA, a framework that stealthily embeds persistent, highly transmissible malicious payloads through concept hijacking and task injection, operating undetected by end users. Leveraging collaborative mechanisms such as model merging, the attack achieves viral propagation across models and platforms. Empirical evaluation across six datasets from Civitai and Liblib, spanning four distinct scenarios, demonstrates near-perfect attack success rates. The malicious payloads remain effective even after cross-base-model transfer and survive more than five rounds of adapter fusion, while evading current platform-level detection systems, going well beyond what conventional backdoor attacks achieve.

📝 Abstract

The prosperity of text-to-image (T2I) models has fostered a vibrant share-and-play ecosystem centered on Low-Rank Adaptation (LoRA) plugins, which allow users to customize and share model capabilities with ease. This democratization, however, comes with a hidden but severe security risk. Malicious users can distribute seemingly benign LoRA plugins that contain hidden functionality, poisoning model-sharing markets such as Civitai and Liblib. This severely undermines the user trust that underpins the collaborative ecosystem and threatens the safety of countless downstream applications. Despite these risks, plugin poisoning in the real-world T2I ecosystem remains underexplored. This paper introduces PoisonLoRA, the first systematic study of LoRA plugin supply-chain risks, which exploits the trust and characteristics of the T2I ecosystem. We identify two primary attack instances: (1) Concept Hijacking, where a hijacked LoRA generates images intended to influence public opinion and spread propaganda, and (2) Task Injection, where a LoRA is implanted with a hidden task that produces harmful content (e.g., NSFW images) only when activated by a secret key. Critically, the malicious payload persists and propagates like a virus: it weaponizes the very act of creative collaboration (e.g., LoRA merging) to spread, turning every remix into a new carrier. Extensive experiments validate that PoisonLoRA is both effective and stealthy. Specifically, we achieve approximately 100% attack success rates (ASR) on both Civitai and Liblib, on 6 datasets across 4 scenarios, without being detected by the platforms. The poisoned LoRA is also highly robust, retaining nearly 100% ASR even when transferred to different base models and remixed more than 5 times.
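
The abstract does not spell out how LoRA merging works, so the following is only a minimal sketch of one common, simple merge rule: each adapter contributes a low-rank update (alpha / r) * B @ A, and a merge is a weighted sum of those updates. The function names (`lora_delta`, `merge_deltas`), the toy matrices, and the equal merge weights are illustrative assumptions, not the paper's procedure. The point it illustrates is that every parent adapter's update remains a linear component of the merged result, which is why whatever an adapter encodes can carry over into a remix.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_delta(B: Matrix, A: Matrix, alpha: float) -> Matrix:
    """Weight update of one adapter: (alpha / r) * B @ A, where r is the rank."""
    r = len(A)  # A has shape (r x n), so its row count is the rank
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]


def merge_deltas(deltas: List[Matrix], weights: List[float]) -> Matrix:
    """Weighted sum of adapter updates (assumed merge rule, for illustration)."""
    rows, cols = len(deltas[0]), len(deltas[0][0])
    return [
        [sum(w * d[i][j] for d, w in zip(deltas, weights)) for j in range(cols)]
        for i in range(rows)
    ]


# Two toy rank-1 adapters for a 2x2 weight matrix (values are made up).
delta_a = lora_delta(B=[[1.0], [0.0]], A=[[1.0, 2.0]], alpha=1.0)
delta_b = lora_delta(B=[[0.0], [1.0]], A=[[3.0, 4.0]], alpha=1.0)

# Equal-weight merge: each parent's update survives, scaled by its weight.
merged = merge_deltas([delta_a, delta_b], weights=[0.5, 0.5])
print(merged)
```

In this toy case the first row of `merged` comes entirely from the first adapter and the second row from the second, each scaled by 0.5. Real adapters overlap across many layers, and merge tools may use other rules, so this is only the arithmetic skeleton.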

Problem

Research questions and friction points this paper is trying to address.

plugin poisoning

text-to-image

LoRA

supply-chain attack

model security

Innovation

Methods, ideas, or system contributions that make the work stand out.

LoRA poisoning

text-to-image security

concept hijacking