🤖 AI Summary
Diffusion models offer limited control over the sampling process, which makes it difficult to satisfy statistical constraints on the generated data. A key reason is that the relationship between perturbations of the initial noise and the final outputs has not been well characterized. This work gives theoretical and empirical evidence that, under diffusion ODE sampling, the change in the output is highly linear in the scale of the initial noise perturbation. Building on this insight, the authors propose CCS (Controllable and Constrained Sampling), a sampling framework that operates explicitly in the noise space. CCS employs differentiable controllers in the noise space to specify target statistics (e.g., mean and variance) without altering the model architecture or requiring retraining. Evaluated across multiple benchmarks, CCS achieves state-of-the-art statistical controllability while preserving high sample quality (FID ≤ 2.1) and diversity (LPIPS ≥ 0.43).
📝 Abstract
Diffusion models have emerged as powerful tools for generative tasks, producing high-quality outputs across diverse domains. However, how the generated data responds to perturbations of the initial noise remains under-explored, which hinders our understanding of the controllability of the sampling process. In this work, we first observe an interesting phenomenon: under diffusion ODE sampling, the change in the generated output is highly linear in the scale of the initial noise perturbation. We then provide both a theoretical and an empirical study to justify the linearity of this input-output (noise to generated data) relationship. Inspired by these insights, we propose a novel Controllable and Constrained Sampling (CCS) method, together with a new controller algorithm, that lets diffusion models sample with desired statistical properties while preserving good sample quality. We perform extensive experiments comparing our sampling approach with other methods on both sampling controllability and sample quality. Results show that CCS achieves more precisely controlled sampling while maintaining superior sample quality and diversity.
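The linear response described above can be illustrated on a toy problem. The following is a minimal sketch, not the paper's CCS algorithm: it assumes 1D Gaussian data N(MU, S^2), a variance-exploding noise schedule with sigma(t) = t, and the analytic score in place of a trained network. In this toy setting the ODE sampler is exactly affine in the initial noise; for real diffusion models the abstract claims only that the relationship is highly linear. All names (`ode_sample`, `response`, `shift_for_target_mean`) and constants are hypothetical and chosen for illustration.

```python
import math

MU, S = 2.0, 0.5       # toy data distribution N(MU, S^2) (assumption)
T, STEPS = 10.0, 200   # largest noise level and Euler step count (assumptions)


def ode_sample(x_T):
    """Euler-integrate the probability-flow ODE from t = T down to t = 0.

    With sigma(t) = t and Gaussian data, the score of the noised marginal
    is -(x - MU) / (S^2 + t^2), so the ODE is dx/dt = t * (x - MU) / (S^2 + t^2).
    """
    x, dt = x_T, T / STEPS
    for i in range(STEPS):
        t = T - i * dt
        x -= dt * t * (x - MU) / (S * S + t * t)
    return x


def response(eps, delta, lam):
    """Change in the output when the initial noise eps is perturbed by lam * delta."""
    return ode_sample(eps + lam * delta) - ode_sample(eps)


def shift_for_target_mean(noises, target_mean):
    """Hypothetical noise-space controller: one shared shift that moves the
    output sample mean to target_mean, using a finite-difference slope."""
    outputs = [ode_sample(e) for e in noises]
    current_mean = sum(outputs) / len(outputs)
    slope = response(noises[0], 1.0, 1.0)  # output change per unit noise shift
    return (target_mean - current_mean) / slope


# Linearity check: the output change divided by the scale should be constant.
eps, delta = 3.0, 1.0
for lam in (0.1, 0.5, 1.0, 2.0):
    change = response(eps, delta, lam)
    print(f"lam={lam:4.1f}  change={change:.6f}  change/lam={change / lam:.6f}")

# Closed-form slope of the exact ODE flow, for comparison with the Euler slope.
print("analytic slope:", S / math.sqrt(S * S + T * T))
```

Because the toy sampler is affine, `shift_for_target_mean` hits the requested mean in a single step; with a learned score function the same idea would be only approximate and would likely need iteration or a more careful controller.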