AI Summary
Discrete diffusion models face a challenge at inference time: they must satisfy external reward constraints without any fine-tuning of the model. Method: We propose a zero-shot, training-free sampling framework that, for the first time, couples Twisted Sequential Monte Carlo (Twisted SMC) with a Gumbel-Softmax relaxation, and employs a first-order Taylor approximation to estimate reward gradients efficiently. This enables differentiable, low-variance, high-fidelity importance reweighting in the discrete latent space. Contribution/Results: The framework requires no model modification or auxiliary training, yet significantly improves constraint-satisfaction rates and generation quality across synthetic-data and image-modeling tasks. It establishes a new plug-and-play paradigm for controllable generation with discrete diffusion models.
Abstract
Discrete diffusion models have become highly effective across various domains. However, real-world applications often require the generative process to adhere to certain constraints without task-specific fine-tuning. To this end, we propose a training-free method based on Sequential Monte Carlo (SMC) that samples from the reward-aligned target distribution at test time. Our approach leverages twisted SMC with an approximate locally optimal proposal, obtained via a first-order Taylor expansion of the reward function. To address the challenge of ill-defined gradients in discrete spaces, we incorporate a Gumbel-Softmax relaxation, enabling efficient gradient-based approximation within the discrete generative framework. Empirical results on both synthetic datasets and image modeling validate the effectiveness of our approach.
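The pipeline described in the abstract can be illustrated on a toy, single-step problem. The sketch below is not the paper's implementation: the vocabulary size, reward function, temperature, and particle count are all hypothetical choices for illustration. It tilts a base categorical proposal by a reward gradient evaluated at a Gumbel-Softmax relaxed sample (the first-order Taylor surrogate for the discrete reward), then importance-reweights and resamples the particles:

```python
import numpy as np

rng = np.random.default_rng(0)

V, K, TAU = 5, 64, 0.5   # toy vocab size, particle count, relaxation temperature

# Hypothetical linear reward r(p) = <w, p> that prefers token 3; it stands in
# for an arbitrary external constraint. Its gradient w.r.t. a relaxed sample
# is simply w (constant only because this toy reward is linear).
w = np.array([0.0, 0.0, 0.0, 4.0, 0.0])

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def log_prob(logits, idx):
    """Log-probability of tokens `idx` under categorical `logits`."""
    logp = logits - logits.max(axis=-1, keepdims=True)
    logp = logp - np.log(np.exp(logp).sum(axis=-1, keepdims=True))
    return logp[np.arange(len(idx)), idx]

# Base denoiser output for one step: uninformative uniform logits per particle.
base_logits = np.zeros((K, V))

# 1) Gumbel-Softmax relaxation: a differentiable surrogate of a discrete draw.
gumbel = rng.gumbel(size=(K, V))
y = softmax((base_logits + gumbel) / TAU)

# 2) First-order Taylor expansion of the reward around y:
#    r(x) ~= r(y) + grad_r(y) . (x - y); only the gradient term tilts the proposal.
grad = np.broadcast_to(w, y.shape)   # d r / d y = w for the linear toy reward

# 3) Twisted (approximately locally optimal) proposal: exp-tilt the base logits.
proposal_logits = base_logits + grad
probs = softmax(proposal_logits)
x = np.array([rng.choice(V, p=p) for p in probs])

# 4) Importance weights correct for the tilt:
#    log w_k = r(x_k) + log base(x_k) - log proposal(x_k).
#    Here the linear reward makes the tilt exact, so weights come out uniform;
#    for nonlinear rewards they correct the first-order Taylor error.
r_x = w[x]
log_w = r_x + log_prob(base_logits, x) - log_prob(proposal_logits, x)
weights = np.exp(log_w - log_w.max())
weights = weights / weights.sum()

# 5) Multinomial resampling concentrates particles on high-reward states.
resampled = x[rng.choice(K, size=K, p=weights)]
print("fraction on rewarded token:", np.mean(resampled == 3))
```

Under these toy settings the tilted proposal puts most mass on the rewarded token even though the base logits are uniform, which is the qualitative effect the reward-aligned sampler is designed to produce.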