🤖 AI Summary
Problem: Diffusion models carry a high theoretical barrier to entry, particularly for researchers without a background in statistical physics, which hinders both accessibility and deep understanding.
Method: This paper introduces a unified variational perspective grounded in directed graphical models and variational inference, decoupling diffusion modeling from non-equilibrium thermodynamics and stochastic differential equations. It builds on VAE principles, hierarchical latent-variable modeling, and ELBO decomposition to reformulate the probabilistic generative logic of diffusion processes.
Contribution/Results: The paper presents a purely probabilistic graphical formulation of diffusion models that substantially lowers the prerequisite knowledge required. The resulting framework offers a self-consistent, interpretable, and pedagogically accessible theoretical narrative, and it helps place recent advances in generative modeling in context for practitioners and new researchers.
📝 Abstract
Despite the growing interest in diffusion models, gaining a deep understanding of the model class remains an elusive endeavour, particularly for the uninitiated in non-equilibrium statistical physics. Thanks to the rapid rate of progress in the field, most existing work on diffusion models focuses on either applications or theoretical contributions. Unfortunately, the theoretical material is often inaccessible to practitioners and new researchers, leading to a risk of superficial understanding in ongoing research. Given that diffusion models are now an indispensable tool, a clear and consolidating perspective on the model class is needed to properly contextualize recent advances in generative modelling and lower the barrier to entry for new researchers. To that end, we revisit predecessors to diffusion models like hierarchical latent variable models and synthesize a holistic perspective using only directed graphical modelling and variational inference principles. The resulting narrative is easier to follow as it imposes fewer prerequisites on the average reader relative to the view from non-equilibrium thermodynamics or stochastic differential equations.
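As a rough illustration of what a view built only on directed graphical modelling and variational inference looks like, the block below sketches the standard evidence lower bound (ELBO) for a Markovian hierarchical latent variable model. The notation is an assumption made for this sketch and is not taken from the abstract: $x$ is the data, $z_{1:T}$ are the latent variables, $q$ is the fixed inference (noising) chain, and $p_\theta$ is the learned generative chain.

```latex
% Assumed factorisations (notation is illustrative, not the paper's):
%   generative model: p_\theta(x, z_{1:T}) = p(z_T)\, p_\theta(x \mid z_1) \prod_{t=2}^{T} p_\theta(z_{t-1} \mid z_t)
%   inference model:  q(z_{1:T} \mid x)    = q(z_1 \mid x) \prod_{t=2}^{T} q(z_t \mid z_{t-1})
\begin{align}
\log p_\theta(x)
  &\geq \mathbb{E}_{q(z_{1:T} \mid x)}
     \left[ \log \frac{p_\theta(x, z_{1:T})}{q(z_{1:T} \mid x)} \right] \\
  &= \underbrace{\mathbb{E}_{q(z_1 \mid x)}\left[ \log p_\theta(x \mid z_1) \right]}_{\text{reconstruction}}
   - \underbrace{D_{\mathrm{KL}}\left( q(z_T \mid x) \,\|\, p(z_T) \right)}_{\text{prior matching}} \nonumber \\
  &\quad - \sum_{t=2}^{T}
     \underbrace{\mathbb{E}_{q(z_t \mid x)}\left[
       D_{\mathrm{KL}}\left( q(z_{t-1} \mid z_t, x) \,\|\, p_\theta(z_{t-1} \mid z_t) \right)
     \right]}_{\text{denoising matching at step } t}
\end{align}
```

The first line is the usual variational bound obtained from Jensen's inequality. The second follows from the Markov property of $q$ together with Bayes' rule, $q(z_t \mid z_{t-1}) = q(z_{t-1} \mid z_t, x)\, q(z_t \mid x) / q(z_{t-1} \mid x)$. No thermodynamic or stochastic differential equation machinery is needed to arrive at this decomposition, which is the sense in which the variational route imposes fewer prerequisites.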