Debiasing Guidance for Discrete Diffusion with Sequential Monte Carlo

📅 2025-02-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing guided sampling methods for discrete diffusion models suffer from asymptotic bias, preventing unbiased sampling from the target distribution $p_0(x_0)\, p(\zeta \mid x_0)^{\alpha}$. This work introduces sequential Monte Carlo (SMC) into discrete diffusion guidance for the first time, proposing a differentiable SMC sampler that integrates a learned unconditional diffusion process, an explicit guided transition kernel, and importance resampling. Crucially, the method provides theoretical guarantees of unbiasedness, overcoming the fundamental bias limitations of conventional classifier and classifier-free guidance. Experiments demonstrate substantial improvements in conditional control fidelity across both image and text generation tasks. Notably, in text generation, the approach achieves superior guidance performance while maintaining low perplexity, outperforming existing guided sampling strategies.
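The importance-resampling component of the summary can be illustrated with a minimal sketch. This is a hedged simplification, not the paper's actual sampler: at one step, each particle is reweighted by the tempered guidance likelihood $p(\zeta \mid x)^{\alpha}$ and the population is resampled in proportion to those weights. The names `smc_resample` and `log_guidance` are hypothetical.

```python
import math
import random

def smc_resample(particles, log_guidance, alpha=1.0, rng=None):
    """One SMC importance-resampling step (illustrative sketch only):
    reweight each particle by p(zeta | x)^alpha, then resample
    the population in proportion to the normalised weights."""
    rng = rng or random.Random()
    # Tempered log-weights for each particle.
    logw = [alpha * log_guidance(x) for x in particles]
    # Subtract the max before exponentiating for numerical stability.
    m = max(logw)
    w = [math.exp(lw - m) for lw in logw]
    total = sum(w)
    probs = [wi / total for wi in w]
    # Multinomial resampling: duplicate high-weight particles,
    # drop low-weight ones, keeping the population size fixed.
    return rng.choices(particles, weights=probs, k=len(particles))
```

In the full algorithm such a step would be interleaved with transitions from the learned unconditional and guided processes; the resampling is what corrects the bias that plain guidance leaves behind.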

📝 Abstract
Discrete diffusion models are a class of generative models that produce samples from an approximated data distribution within a discrete state space. Often, there is a need to target specific regions of the data distribution. Current guidance methods aim to sample from a distribution with mass proportional to $p_0(x_0)\, p(\zeta \mid x_0)^{\alpha}$ but fail to achieve this in practice. We introduce a Sequential Monte Carlo algorithm that generates unbiasedly from this target distribution, utilising the learnt unconditional and guided process. We validate our approach on low-dimensional distributions, controlled images and text generations. For text generation, our method provides strong control while maintaining low perplexity compared to guidance-based approaches.
Problem

Research questions and friction points this paper is trying to address.

Debiasing guidance in discrete diffusion models
Targeting specific data distribution regions
Maintaining low perplexity in text generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sequential Monte Carlo algorithm
Unbiased target distribution sampling
Enhanced text generation control
Cheuk Kit Lee
University of Cambridge
Paul Jeha
Technical University of Denmark
Deep Learning · Generative Models · GAN · Neural ODE · Neural SDE
Jes Frellsen
Associate Professor, Technical University of Denmark (DTU)
Deep Learning · Deep generative models · Missing data · Bioinformatics · Directional Statistics
Pietro Lio
University of Cambridge
Michael Samuel Albergo
Society of Fellows, Harvard University; Institute for Artificial Intelligence and Fundamental Interactions, Massachusetts Institute of Technology
Francisco Vargas
Professor at Universidad de Antioquia
Biometrics · Signal processing · Pattern recognition