🤖 AI Summary
Existing diffusion-based language models struggle to effectively leverage global bidirectional context during decoding, resulting in inefficient trajectory planning. This work proposes Plan-Verify-Fill (PVF), a hierarchical planning framework that, for the first time, integrates quantitative verification into the decoding process: it first constructs a semantic skeleton, then validates the skeleton's structural plausibility, and finally fills in the remaining details in parallel. By combining semantic-anchor prioritization, a structured verification protocol, and the bidirectional context that diffusion models provide, PVF enables training-free structured parallel decoding with a dynamic stopping strategy. Evaluated on LLaDA-8B-Instruct and Dream-7B-Instruct, PVF reduces the number of function evaluations by up to 65% compared to confidence-based parallel decoding, substantially improving decoding efficiency while preserving generation quality.
📝 Abstract
Diffusion Language Models (DLMs) offer a promising non-sequential paradigm for text generation, distinct from standard autoregressive (AR) approaches. However, current decoding strategies are largely reactive, underutilizing the models' bidirectional context to plan global decoding trajectories. To address this, we propose Plan-Verify-Fill (PVF), a training-free paradigm that grounds planning in quantitative verification. PVF actively constructs a hierarchical skeleton by prioritizing high-leverage semantic anchors, then applies a verification protocol that stops structural refinement once further deliberation yields diminishing returns. Extensive evaluations on LLaDA-8B-Instruct and Dream-7B-Instruct demonstrate that PVF reduces the Number of Function Evaluations (NFE) by up to 65% compared to confidence-based parallel decoding across benchmark datasets, delivering substantially higher efficiency without compromising accuracy.
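The plan/verify/fill stages described above can be illustrated as a minimal decoding loop. This is a sketch under stated assumptions, not the paper's exact algorithm: the mock confidence model, the anchor and verification thresholds, and the specific stopping rule are all hypothetical stand-ins for how a training-free PVF-style decoder might be structured.

```python
# Illustrative sketch of a Plan-Verify-Fill style decoding loop.
# All details (mock model, thresholds, stopping rule) are assumptions.
import random

MASK = None  # placeholder for a masked (undecoded) position
random.seed(0)

def model_confidences(tokens):
    # Stand-in for one diffusion-LM forward pass: for each still-masked
    # position, return a (predicted_token, confidence) pair.
    return {i: (f"tok{i}", random.random())
            for i, t in enumerate(tokens) if t is MASK}

def pvf_decode(length, anchor_thresh=0.7, verify_thresh=0.5, max_rounds=3):
    tokens = [MASK] * length
    nfe = 0  # number of function evaluations (forward passes)
    for _ in range(max_rounds):
        # Plan: commit high-confidence positions as semantic anchors.
        preds = model_confidences(tokens); nfe += 1
        anchors = {i: tok for i, (tok, c) in preds.items() if c >= anchor_thresh}
        if not anchors:
            break  # dynamic stopping: no high-leverage anchors remain
        for i, tok in anchors.items():
            tokens[i] = tok
        # Verify: score the skeleton's plausibility (here, mean confidence
        # over the remaining masked positions) and stop once it suffices.
        remaining = [c for _, c in model_confidences(tokens).values()]; nfe += 1
        if not remaining or sum(remaining) / len(remaining) >= verify_thresh:
            break  # skeleton deemed structurally plausible
    # Fill: decode all remaining masked positions in one parallel step.
    preds = model_confidences(tokens); nfe += 1
    for i, (tok, _) in preds.items():
        tokens[i] = tok
    return tokens, nfe
```

Because planning and filling each consume only one forward pass per round, the total NFE is bounded by the (small) round budget rather than by sequence length, which is the source of the efficiency gain over one-token-at-a-time decoding.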