Test-Time Alignment of Discrete Diffusion Models with Sequential Monte Carlo

๐Ÿ“… 2025-05-28
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Discrete diffusion models face a challenge during inference: they must satisfy external reward constraints without permitting model fine-tuning. Method: We propose a zero-shot, training-free sampling framework thatโ€” for the first timeโ€”couples Twisted Sequential Monte Carlo (Twisted SMC) with Gumbel-Softmax relaxation, and employs first-order Taylor approximation to efficiently estimate reward gradients. This enables differentiable, low-variance, high-fidelity importance reweighting in the discrete latent space. Contribution/Results: The framework requires no model modification or auxiliary training, yet significantly improves constraint satisfaction rates and generation quality across synthetic data and image modeling tasks. It establishes a new plug-and-play paradigm for controllable generation with discrete diffusion models.

Technology Category

Application Category

๐Ÿ“ Abstract
Discrete diffusion models have become highly effective across various domains. However, real-world applications often require the generative process to adhere to certain constraints but without task-specific fine-tuning. To this end, we propose a training-free method based on Sequential Monte Carlo (SMC) to sample from the reward-aligned target distribution at the test time. Our approach leverages twisted SMC with an approximate locally optimal proposal, obtained via a first-order Taylor expansion of the reward function. To address the challenge of ill-defined gradients in discrete spaces, we incorporate a Gumbel-Softmax relaxation, enabling efficient gradient-based approximation within the discrete generative framework. Empirical results on both synthetic datasets and image modelling validate the effectiveness of our approach.
Problem

Research questions and friction points this paper is trying to address.

Aligning discrete diffusion models with constraints without fine-tuning
Sampling reward-aligned distributions using Sequential Monte Carlo
Handling ill-defined gradients in discrete spaces via Gumbel-Softmax
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sequential Monte Carlo for test-time alignment
Gumbel-Softmax relaxation for discrete gradients
First-order Taylor expansion for reward approximation
๐Ÿ”Ž Similar Papers
No similar papers found.