🤖 AI Summary
Existing discrete generative models rely on first-order Taylor approximations for guided sampling, introducing significant approximation errors in discrete state spaces. This work proposes the first exact discrete guidance framework, deriving precise transition rates under the target distribution—replacing heuristic approximations with unbiased, efficient sampling guidance. The framework unifies mainstream discrete diffusion and flow-matching methods and natively supports masked diffusion models. Integrating single-step forward propagation, energy-based guidance, and preference alignment, it substantially improves sampling efficiency (2.1× speedup) and generation quality on text-to-image synthesis and multimodal understanding tasks—achieving an 18.7% reduction in FID and a 12.3% increase in CLIP-Score. These results demonstrate both effectiveness and broad generalizability across discrete generative modeling paradigms.
📝 Abstract
Guidance provides a simple and effective framework for posterior sampling by steering the generation process towards the desired distribution. When modeling discrete data, existing approaches mostly focus on guidance with the first-order Taylor approximation to improve the sampling efficiency. However, such an approximation is inappropriate in discrete state spaces since the approximation error could be large. A novel guidance framework for discrete data is proposed to address this problem: We derive the exact transition rate for the desired distribution given a learned discrete flow matching model, leading to guidance that only requires a single forward pass in each sampling step, significantly improving efficiency. This unified novel framework is general enough, encompassing existing guidance methods as special cases, and it can also be seamlessly applied to the masked diffusion model. We demonstrate the effectiveness of our proposed guidance on energy-guided simulations and preference alignment on text-to-image generation and multimodal understanding tasks. The code is available through https://github.com/WanZhengyan/Discrete-Guidance-Matching/tree/main.