🤖 AI Summary
Text-to-image (T2I) models exhibit an underexplored "brand bias": an overrepresentation of dominant commercial brands in outputs generated from generic prompts, posing ethical and legal risks. To address this, we propose **CIDER**, a model-agnostic, inference-time debiasing framework that requires no model retraining. First, we introduce the Brand Neutrality Score (BNS) to quantify brand bias. Then, we combine a lightweight brand detector with a vision-language model (VLM) and employ causally driven prompt optimization to dynamically refine input prompts during inference, yielding stylistically diverse, brand-neutral images. Experiments across multiple state-of-the-art T2I models demonstrate significant reductions in both explicit and implicit brand bias while preserving image fidelity and aesthetic quality. Our approach establishes a new paradigm for controllable, legally compliant content generation.
📝 Abstract
Text-to-image (T2I) models exhibit a significant yet underexplored "brand bias": a tendency to generate content featuring dominant commercial brands from generic prompts, posing ethical and legal risks. We propose CIDER, a novel, model-agnostic framework that mitigates this bias at inference time through prompt refinement, avoiding costly retraining. CIDER uses a lightweight detector to identify branded content and a Vision-Language Model (VLM) to generate stylistically divergent alternatives. We introduce the Brand Neutrality Score (BNS) to quantify this issue and conduct extensive experiments on leading T2I models. Results show that CIDER significantly reduces both explicit and implicit biases while maintaining image quality and aesthetic appeal. Our work offers a practical path toward more original and equitable content generation, contributing to the development of trustworthy generative AI.
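The detect-then-refine loop described above can be illustrated with a minimal sketch. Note that this is an assumption-laden stand-in, not the paper's implementation: the real CIDER pipeline uses a learned brand detector and a VLM, whereas here `detect_brands` is keyword matching against a hypothetical brand list and `vlm_rewrite` is a trivial string transformation, used only to show the control flow of inference-time prompt refinement.

```python
# Illustrative sketch of a CIDER-style inference-time debiasing loop.
# All components below are hypothetical stand-ins: the actual framework
# pairs a lightweight learned brand detector with a VLM rewriter.

KNOWN_BRANDS = {"nike", "apple", "starbucks"}  # illustrative list only


def detect_brands(text: str) -> set:
    """Stand-in for the lightweight brand detector: keyword matching."""
    words = {w.lower().strip(",.") for w in text.split()}
    return KNOWN_BRANDS & words


def vlm_rewrite(prompt: str, brands: set) -> str:
    """Stand-in for the VLM: drop brand mentions and request a
    stylistically divergent, brand-neutral alternative."""
    kept = [w for w in prompt.split() if w.lower().strip(",.") not in brands]
    return " ".join(kept) + ", in a generic brand-neutral style"


def debias_prompt(prompt: str, max_rounds: int = 3) -> str:
    """Iteratively refine the prompt until no known brand is detected,
    then hand the neutral prompt to the (unchanged) T2I model."""
    for _ in range(max_rounds):
        brands = detect_brands(prompt)
        if not brands:
            break
        prompt = vlm_rewrite(prompt, brands)
    return prompt
```

For example, `debias_prompt("a runner wearing Nike sneakers")` yields a prompt with the brand mention removed, while brand-free prompts pass through untouched; because the loop operates purely on the prompt, it is model-agnostic in the sense the abstract describes.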