🤖 AI Summary
This work addresses the pronounced imbalance in training dynamics across noise levels (parameterized by log-SNR) in diffusion models, which can lead to inefficient and unstable optimization. To mitigate this issue, the paper proposes a variance-aware adaptive weighting strategy that dynamically adjusts training weights based on the loss variance observed at each noise level, encouraging a more balanced optimization process. The method combines a log-SNR-dependent loss variance estimator with a dynamic reweighting mechanism. Experiments on CIFAR-10 and CIFAR-100 show consistently lower FID scores than standard training schemes and reduced performance variance across random seeds. Loss-log-SNR visualizations, variance heatmaps, and ablation studies further indicate that the adaptive weighting stabilizes training dynamics.
📝 Abstract
Diffusion models have recently achieved remarkable success in generative modeling, yet their training dynamics across different noise levels remain highly imbalanced, which can lead to inefficient optimization and unstable learning behavior. In this work, we investigate this imbalance from the perspective of loss variance across log-SNR levels and propose a variance-aware adaptive weighting strategy to address it. The proposed approach dynamically adjusts training weights based on the observed variance distribution, encouraging a more balanced optimization process across noise levels. Extensive experiments on CIFAR-10 and CIFAR-100 demonstrate that the proposed method consistently improves generative performance over standard training schemes, achieving lower Fréchet Inception Distance (FID) while also reducing performance variance across random seeds. Additional analyses, including loss-log-SNR visualization, variance heatmaps, and ablation studies, further reveal that the adaptive weighting effectively stabilizes training dynamics. These results highlight the potential of variance-aware training strategies for improving diffusion model optimization.
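
To make the idea of variance-aware weighting concrete, the sketch below shows one way such a scheme could be set up. It is not the paper's implementation: the abstract does not state the estimator or the weighting rule, so the binning over log-SNR, the exponential moving average, the inverse-variance rule, and every name here (`VarianceAwareWeighter`, `decay`, `eps`, the log-SNR range) are assumptions chosen purely for illustration.

```python
import numpy as np


class VarianceAwareWeighter:
    """Hypothetical helper: tracks loss variance per log-SNR bin and derives weights."""

    def __init__(self, logsnr_min=-10.0, logsnr_max=10.0, num_bins=20,
                 decay=0.99, eps=1e-4):
        # Bin edges over an assumed log-SNR range.
        self.edges = np.linspace(logsnr_min, logsnr_max, num_bins + 1)
        self.decay = decay
        self.eps = eps
        self.mean = np.zeros(num_bins)      # running mean of the loss per bin
        self.sq_mean = np.zeros(num_bins)   # running mean of the squared loss per bin
        self.seen = np.zeros(num_bins, dtype=bool)

    def _bin(self, logsnr):
        # Map each log-SNR value to a bin index, clamping out-of-range values.
        idx = np.digitize(logsnr, self.edges) - 1
        return np.clip(idx, 0, len(self.mean) - 1)

    def update(self, logsnr, losses):
        """Update the running moments with a batch of per-sample losses."""
        idx = self._bin(np.asarray(logsnr, dtype=float))
        losses = np.asarray(losses, dtype=float)
        for b in np.unique(idx):
            batch = losses[idx == b]
            m, s = batch.mean(), (batch ** 2).mean()
            if self.seen[b]:
                self.mean[b] = self.decay * self.mean[b] + (1 - self.decay) * m
                self.sq_mean[b] = self.decay * self.sq_mean[b] + (1 - self.decay) * s
            else:
                # First observation initializes the bin directly.
                self.mean[b], self.sq_mean[b] = m, s
                self.seen[b] = True

    def variance(self):
        # Var[x] = E[x^2] - E[x]^2, floored at zero for numerical safety.
        return np.maximum(self.sq_mean - self.mean ** 2, 0.0)

    def weights(self, logsnr):
        """Per-sample weights; inverse-variance weighting is an assumed rule."""
        w = np.ones_like(self.mean)  # bins without data keep weight 1
        if self.seen.any():
            w[self.seen] = 1.0 / (self.variance()[self.seen] + self.eps)
            w[self.seen] /= w[self.seen].mean()  # mean weight 1 over observed bins
        return w[self._bin(np.asarray(logsnr, dtype=float))]
```

In a training loop under these assumptions, one would call `update` with the detached per-sample losses of each batch and multiply the per-sample losses by `weights(logsnr)` before averaging. Whether high-variance noise levels should be down-weighted, as here, or emphasized is a design choice the abstract leaves open.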