🤖 AI Summary
This paper systematically investigates how the choice of loss function in diffusion models affects downstream goals such as sample quality and likelihood estimation. Methodologically, it establishes a unified variational lower bound (VLB, also called ELBO) framework to derive and compare the optimization objectives, bias sources, and asymptotic properties of prevalent losses, including weighted L2, denoising score matching, importance-weighted, and reconstruction objectives. Combining Monte Carlo estimation with large-scale experiments, it reveals trade-offs among sample fidelity, distribution coverage, and log-likelihood estimation accuracy. The key contribution is a principled analysis, grounded in the consistency of the variational objective, that characterizes both the conditions under which these losses are equivalent and the conditions under which their performance diverges. This yields an interpretable theoretical foundation and empirically validated guidance for task-aware loss design in diffusion modeling.
📝 Abstract
Diffusion models have emerged as powerful generative models, inspiring extensive research into their underlying mechanisms. One of the key questions in this area is which loss functions these models should be trained with. Multiple formulations have been introduced in the literature over the past several years; they share some connections but also differ in critical ways that stem from their different starting assumptions. In this paper, we explore the different target objectives and corresponding loss functions in detail. We present a systematic overview of their relationships, unifying them under the framework of the variational lower bound objective. We complement this theoretical analysis with an empirical study providing insights into the conditions under which these objectives diverge in performance and the underlying factors contributing to such deviations. Additionally, we evaluate how the choice of objective impacts the model's ability to achieve specific goals, such as generating high-quality samples or accurately estimating likelihoods. This study offers a unified understanding of loss functions in diffusion models, contributing to more efficient and goal-oriented model designs in future research.
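To make the idea of objectives that differ only by a weighting concrete, here is a minimal sketch, assuming a Gaussian forward process x_t = alpha * x_0 + sigma * eps with alpha^2 + sigma^2 = 1. This parameterization, the function names, and the synthetic "network output" are illustrative assumptions, not taken from the paper. Under this assumption, the noise-prediction squared error equals the data-prediction squared error scaled by the signal-to-noise ratio alpha^2 / sigma^2, so the two losses coincide up to a time-dependent weight.

```python
import numpy as np

def forward_noise(x0, alpha, sigma, rng):
    """Sample x_t = alpha * x0 + sigma * eps and return (x_t, eps)."""
    eps = rng.standard_normal(x0.shape)
    return alpha * x0 + sigma * eps, eps

def eps_to_x0(x_t, eps_hat, alpha, sigma):
    """Convert a noise prediction into the implied prediction of x0."""
    return (x_t - sigma * eps_hat) / alpha

def compare_losses(seed=0, alpha=0.8):
    """Return per-sample eps-loss, x0-loss, and the SNR at this noise level."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(1.0 - alpha**2)   # assumed variance-preserving schedule
    snr = alpha**2 / sigma**2         # signal-to-noise ratio

    x0 = rng.standard_normal((16, 4))  # toy data, not real samples
    x_t, eps = forward_noise(x0, alpha, sigma, rng)

    # Hypothetical stand-in for a network's noise prediction: truth plus error.
    eps_hat = eps + 0.1 * rng.standard_normal(eps.shape)
    x0_hat = eps_to_x0(x_t, eps_hat, alpha, sigma)

    eps_loss = np.sum((eps - eps_hat) ** 2, axis=-1)
    x0_loss = np.sum((x0 - x0_hat) ** 2, axis=-1)
    return eps_loss, x0_loss, snr

if __name__ == "__main__":
    eps_loss, x0_loss, snr = compare_losses()
    # The two losses agree once the x0-loss is reweighted by the SNR.
    print(np.allclose(eps_loss, snr * x0_loss))
```

Because the weight depends on the noise level, training on one loss rather than the other shifts emphasis across noise levels, which is one plausible way objectives that are equivalent in form can still differ in practice.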