🤖 AI Summary
Diffusion models with fixed Gaussian priors often suffer from insufficient exploration, large discretization errors, and mode collapse, particularly under reverse KL divergence objectives, due to misalignment between the prior's support and that of the target distribution. To address this, we propose an end-to-end learnable Gaussian Mixture Prior (GMP) that dynamically increases the number of components during training, thereby enhancing expressivity, support adaptability, and mode diversity. Our method integrates variational diffusion modeling, Gaussian Mixture Model (GMM) parameterization, and reverse KL optimization, augmented by a principled component-growth strategy. Evaluated across diverse real-world and synthetic benchmarks, the proposed approach significantly improves sample quality without requiring additional target density evaluations. It effectively mitigates mode collapse and reduces sampling error, offering a new approach to diffusion prior modeling.
📝 Abstract
Diffusion models optimized via variational inference (VI) have emerged as a promising tool for generating samples from unnormalized target densities. These models create samples by simulating a stochastic differential equation, starting from a simple, tractable prior, typically a Gaussian distribution. However, when the support of this prior differs greatly from that of the target distribution, diffusion models often struggle to explore effectively or suffer from large discretization errors. Moreover, learning the prior distribution can lead to mode collapse, exacerbated by the mode-seeking nature of the reverse Kullback-Leibler divergence commonly used in VI. To address these challenges, we propose end-to-end learnable Gaussian mixture priors (GMPs). GMPs offer improved control over exploration, adaptability to the target's support, and increased expressiveness to counteract mode collapse. We further leverage the structure of mixture models by proposing a strategy to iteratively refine the model by adding mixture components during training. Our experimental results demonstrate significant performance improvements across a diverse range of real-world and synthetic benchmark problems when using GMPs, without requiring additional target evaluations.
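To make the two core ideas concrete, a learnable Gaussian mixture prior and an iterative component-growth step, here is a minimal NumPy sketch. All names (`GaussianMixturePrior`, `add_component`) are illustrative assumptions, not the paper's actual implementation; in the paper the parameters would be optimized end-to-end through the diffusion objective, which is omitted here.

```python
import numpy as np

class GaussianMixturePrior:
    """Hypothetical sketch of a GMP: diagonal-covariance mixture with
    learnable means, scales, and mixture logits."""

    def __init__(self, dim, n_components=1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.dim = dim
        self.means = np.zeros((n_components, dim))      # learnable component means
        self.log_stds = np.zeros((n_components, dim))   # learnable diagonal log-scales
        self.logits = np.zeros(n_components)            # learnable mixture weights (pre-softmax)

    @property
    def weights(self):
        # Softmax over logits gives normalized mixture weights.
        e = np.exp(self.logits - self.logits.max())
        return e / e.sum()

    def sample(self, n):
        """Ancestral sampling: pick a component index, then draw a Gaussian sample."""
        ks = self.rng.choice(len(self.logits), size=n, p=self.weights)
        eps = self.rng.standard_normal((n, self.dim))
        return self.means[ks] + np.exp(self.log_stds[ks]) * eps

    def add_component(self, mean, log_std, logit=0.0):
        """Growth step: append a new mode, e.g. initialized in a region
        the current prior covers poorly (heuristic placement assumed here)."""
        self.means = np.vstack([self.means, np.asarray(mean)[None]])
        self.log_stds = np.vstack([self.log_stds, np.asarray(log_std)[None]])
        self.logits = np.append(self.logits, logit)

# Start from a single standard Gaussian, then grow a second mode.
prior = GaussianMixturePrior(dim=2)
prior.add_component(mean=np.array([5.0, -5.0]), log_std=np.zeros(2))
x = prior.sample(1000)
print(x.shape, len(prior.weights))  # (1000, 2) 2
```

The extra mode immediately widens the prior's support, which is the mechanism the abstract credits for better exploration and reduced mode collapse compared to a single fixed Gaussian.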