AI Summary
Text-to-image generation models often represent socially sensitive attributes (e.g., gender, race) unfairly because of biases inherent in their training data, which poses significant ethical risks. To address this, we propose the first multimodal chain-of-thought (CoT) reasoning framework tailored for fairness in text-to-image synthesis. Our method dynamically constrains the generation process in a zero-shot setting via iterative prompt refinement and real-time semantic calibration. It integrates multimodal large language models with cross-model adaptation interfaces, enabling plug-and-play fairness enhancement for DALL·E and multiple Stable Diffusion variants. Experiments demonstrate a 32.7% improvement in balanced representation rate with negligible degradation in generation quality: FID increases by less than 0.8, and CLIP Score variation remains under 1.2%. The approach thus significantly enhances fairness while preserving both fidelity and semantic alignment.
Abstract
In text-to-image generative models, biases inherent in training datasets often propagate into generated content, posing significant ethical challenges, particularly in socially sensitive contexts. We introduce FairCoT, a novel framework that enhances fairness in text-to-image models through chain-of-thought (CoT) reasoning within multimodal generative large language models. FairCoT employs iterative CoT refinement to systematically mitigate biases and dynamically adjusts textual prompts in real time, ensuring diverse and equitable representation in generated images. By integrating iterative reasoning processes, FairCoT addresses the limitations of zero-shot CoT in sensitive scenarios, balancing creativity with ethical responsibility. Experimental evaluations across popular text-to-image systems, including DALL·E and various Stable Diffusion variants, demonstrate that FairCoT significantly enhances fairness and diversity without sacrificing image quality or semantic fidelity. By combining robust reasoning, lightweight deployment, and extensibility to multiple models, FairCoT represents a promising step toward more socially responsible and transparent AI-driven content generation.
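The idea of adjusting prompts dynamically so that generated images stay balanced can be illustrated with a small sketch. This is not FairCoT's actual procedure, which relies on CoT reasoning inside a multimodal LLM. It is a simplified stand-in that assumes a fixed, hypothetical list of attribute values (`GENDERS`) and uses a counter in place of inspecting real generated images. The function names `refine_prompt` and `generate_balanced_prompts` are also illustrative.

```python
from collections import Counter

# Hypothetical attribute values; the source does not list the categories it balances.
GENDERS = ["female", "male", "non-binary"]


def refine_prompt(base_prompt, history, values):
    """Return a refined prompt that targets the least-represented value so far."""
    # min() returns the first minimal element, so ties follow the order of `values`.
    target = min(values, key=lambda v: history[v])
    return f"{base_prompt}, {target}", target


def generate_balanced_prompts(base_prompt, n, values):
    """Produce n prompts, refining each one from the running attribute counts."""
    history = Counter()
    prompts = []
    for _ in range(n):
        prompt, target = refine_prompt(base_prompt, history, values)
        # Stand-in for generating an image and detecting its attribute:
        # here we simply assume the image matches the requested value.
        history[target] += 1
        prompts.append(prompt)
    return prompts


prompts = generate_balanced_prompts("a photo of a doctor", 6, GENDERS)
```

In a real pipeline the counter update would come from an attribute classifier or an MLLM judging the generated image, so the loop could correct for a model that ignores the requested attribute.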