AI Summary
Quantizing diffusion models is challenging because their iterative denoising process is computationally intensive, especially under resource constraints. Post-training quantization (PTQ) accelerates inference, but quantization errors accumulate progressively across timesteps and severely degrade generation quality. This work presents the first closed-form characterization of cumulative quantization error in diffusion processes and establishes a timestep-aware error propagation model. Based on this analysis, we propose a dynamic compensation mechanism that introduces a timestep-dependent error correction term at each denoising step; it requires no retraining and is fully compatible with standard PTQ pipelines. Evaluated across multiple image datasets, our method significantly mitigates error accumulation: at INT4 and INT8 precision, it substantially outperforms existing PTQ approaches in generation fidelity, achieving state-of-the-art performance.
Abstract
Diffusion models have transformed image synthesis by setting new benchmarks for quality and creativity. Nevertheless, their large-scale deployment is hampered by a computationally intensive iterative denoising process. Although post-training quantization (PTQ) provides an effective pathway for accelerating sampling, the iterative nature of diffusion models causes stepwise quantization errors to accumulate progressively during generation, inevitably compromising output fidelity. To address this challenge, we develop a theoretical framework that mathematically formulates error propagation in Diffusion Models (DMs), deriving per-step quantization error propagation equations and establishing the first closed-form solution for cumulative error. Building on this theoretical foundation, we propose a timestep-aware cumulative error compensation scheme. Extensive experiments across multiple image datasets demonstrate that our compensation strategy effectively mitigates error propagation, significantly enhancing existing PTQ methods and achieving state-of-the-art (SOTA) performance on low-precision diffusion models.
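The core idea can be illustrated with a toy numerical sketch. The linear "denoiser" `eps(x) = C * x`, the per-step coefficients `a[t]`, `b[t]`, and the truncating fake-quantizer below are all illustrative assumptions, not the paper's actual model or derivation; the sketch only shows the mechanism: quantization error injected at every step compounds through the sampling recursion, and subtracting a precomputed timestep-dependent correction term at each step keeps the trajectory close to the full-precision one.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_quant(x, bits=4):
    # Simulated uniform quantization with truncation; the truncation
    # introduces a small systematic bias, standing in for real PTQ error.
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1) + 1e-12
    return np.floor(x / scale) * scale

# Toy linear "denoiser": eps(x) = C * x stands in for the noise-prediction net.
T, C = 50, 0.3
a = np.full(T, 0.98)   # illustrative per-step signal coefficients
b = np.full(T, 0.05)   # illustrative per-step noise coefficients

def run(x0, correction=None):
    """Run one denoising trajectory with a quantized denoiser.

    If `correction` is given, subtract the timestep-dependent term
    correction[t] from the quantized prediction at every step.
    """
    x = x0.copy()
    step_err = []
    for t in range(T):
        eps_fp = C * x
        eps_q = fake_quant(eps_fp)
        step_err.append(np.mean(eps_q - eps_fp))  # mean per-step quant error
        if correction is not None:
            eps_q = eps_q - correction[t]          # timestep-aware compensation
        x = a[t] * x + b[t] * eps_q
    return x, np.array(step_err)

x0 = rng.normal(size=4096)

# Full-precision reference trajectory.
x_ref = x0.copy()
for t in range(T):
    x_ref = a[t] * x_ref + b[t] * (C * x_ref)

# Estimate the mean per-step error mu[t] on held-out samples (no retraining).
_, mu = run(rng.normal(size=4096))

x_plain, _ = run(x0)                 # quantized, no compensation
x_comp, _ = run(x0, correction=mu)   # quantized + timestep-aware compensation

mse = lambda y: float(np.mean((y - x_ref) ** 2))
print(f"no compensation: {mse(x_plain):.2e}, compensated: {mse(x_comp):.2e}")
```

Because the correction is a small per-timestep vector estimated once, it adds negligible inference cost and slots into any existing PTQ pipeline, which is the property the abstract emphasizes.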