🤖 AI Summary
Diffusion models generate high-fidelity images but offer no explicit control over perceptual quality. To address this, the authors propose IQA-Adapter, a framework that injects knowledge from no-reference image quality assessment (IQA) models into the diffusion process, enabling quality-aware generation. Gradient-based guidance on IQA scores is explored first but shown to generalize poorly; IQA-Adapter instead learns the implicit relationship between images and quality scores, supporting both conditioning on a target quality level and reference-based, content-agnostic transfer of qualitative features via the rich activation space of IQA models. Conditioned on low target quality, the same mechanism acts as a learned degradation model. When conditioned on high target quality, IQA-Adapter improves objective quality metrics by up to 10%, is preferred by participants in a user study, and preserves both sample diversity and content fidelity. This work integrates IQA priors into diffusion-based synthesis, bridging perceptual quality modeling and generative control.
📝 Abstract
Diffusion-based models have recently revolutionized image generation, achieving unprecedented levels of fidelity. However, consistent generation of high-quality images remains challenging, partly due to the lack of conditioning mechanisms for perceptual quality. In this work, we propose methods to integrate image quality assessment (IQA) models into diffusion-based generators, enabling quality-aware image generation. We show that diffusion models can learn complex qualitative relationships from both IQA models' outputs and internal activations. First, we experiment with gradient-based guidance to optimize image quality directly and show that this method has limited generalizability. To address this, we introduce IQA-Adapter, a novel framework that conditions generation on target quality levels by learning the implicit relationship between images and quality scores. When conditioned on high target quality, IQA-Adapter can shift the distribution of generated images towards a higher-quality subdomain; inversely, it can be used as a degradation model, generating progressively more distorted images when provided with a lower-quality signal. Under a high-quality condition, IQA-Adapter achieves up to a 10% improvement across multiple objective metrics, as confirmed by a user preference study, while preserving generative diversity and content. Furthermore, we extend IQA-Adapter to a reference-based conditioning scenario, utilizing the rich activation space of IQA models to transfer highly specific, content-agnostic qualitative features between images.
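The gradient-based guidance mentioned in the abstract follows the classifier-guidance pattern: at each denoising step, the intermediate sample is nudged along the gradient of a differentiable quality score. The sketch below illustrates only this update rule, using NumPy and a toy smoothness-based "quality" proxy in place of a real no-reference IQA network; the function names, guidance scale, and score are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Toy differentiable "quality" proxy standing in for a no-reference IQA model:
# it rewards smoothness by penalizing squared differences between horizontally
# adjacent pixels. This is for demonstration only, not the paper's IQA network.
def iqa_score(img):
    d = img[:, 1:] - img[:, :-1]
    return -np.sum(d ** 2)

def iqa_grad(img):
    # Analytic gradient of the toy score with respect to the image.
    g = np.zeros_like(img)
    d = img[:, 1:] - img[:, :-1]
    g[:, 1:] -= 2 * d
    g[:, :-1] += 2 * d
    return g

def guided_step(img, guidance_scale=0.1):
    # One guidance update: move the intermediate sample in the direction that
    # increases the quality score, analogous to classifier guidance using a
    # classifier's log-probability gradient. In a diffusion sampler this term
    # would be added to the model's denoising update at each step.
    return img + guidance_scale * iqa_grad(img)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8))  # stand-in for a noisy intermediate sample
before = iqa_score(x)
for _ in range(50):
    x = guided_step(x)
after = iqa_score(x)
# With a small enough step size, the quality score improves.
```

In the full method, `iqa_score` would be a neural IQA model evaluated on the predicted clean image, and the abstract notes that such direct gradient optimization generalizes poorly, which motivates the learned IQA-Adapter instead.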