🤖 AI Summary
Discrete diffusion models are gaining prominence in natural language and graph learning tasks, yet the theory behind their discretized samplers lags behind practice: existing analyses of τ-leaping rely on strong, hard-to-verify regularity assumptions and yield KL divergence bounds that scale quadratically with the vocabulary size |𝒱|. This work introduces a unified analysis framework for discrete diffusion sampling that removes those regularity assumptions. Replacing the conventional Girsanov change-of-measure argument with differential inequality techniques, we derive KL convergence bounds for the standard τ-leaping sampler that are linear in |𝒱|, and obtain the first convergence guarantees for other widely used samplers, including the Euler method and Tweedie τ-leaping. These are, to date, the tightest and most general KL divergence convergence guarantees for discrete diffusion models.
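To make the τ-leaping update concrete, below is a minimal sketch of one step for a factorized discrete diffusion process, where each of `d` coordinates jumps within a vocabulary of size `V`. The per-coordinate rate matrix `rates` (in practice produced by a learned network) and the uniform clash rule are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def tau_leaping_step(x, rates, tau, rng):
    """One tau-leaping step for a factorized discrete diffusion sampler.

    x     : (d,) int array, current token ids in {0, ..., V-1}
    rates : (d, V) array, rates[i, v] = transition rate of coordinate i
            to vocabulary state v (self-transition entries are ignored)
    tau   : step size
    rng   : np.random.Generator
    """
    d, V = rates.shape
    rates = rates.copy()
    rates[np.arange(d), x] = 0.0                 # no self-jumps
    # Defining approximation of tau-leaping: freeze the rates over
    # [t, t + tau] and draw jump counts from a Poisson distribution.
    counts = rng.poisson(tau * rates)            # (d, V) jump counts
    x_new = x.copy()
    for i in np.nonzero(counts.sum(axis=1) > 0)[0]:
        # If several jumps fire on one coordinate, keep one uniformly
        # at random (a common clash rule; other conventions exist).
        candidates = np.repeat(np.arange(V), counts[i])
        x_new[i] = rng.choice(candidates)
    return x_new
```

The `rng.poisson(tau * rates)` line is where the discretization error analyzed in the paper enters: rates are held fixed across the step. An Euler-type discretization would instead allow at most one jump per coordinate, sampling it from a categorical distribution whose off-diagonal probabilities are approximately `tau * rates`.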
📝 Abstract
Discrete diffusion models have recently gained significant prominence in applications involving natural language and graph data. A key factor influencing their effectiveness is the efficiency of discretized samplers. Among these, $\tau$-leaping samplers have become particularly popular due to their empirical success. However, existing theoretical analyses of $\tau$-leaping often rely on somewhat restrictive and difficult-to-verify regularity assumptions, and their convergence bounds exhibit a quadratic dependence on the vocabulary size. In this work, we introduce a new analytical approach for discrete diffusion models that removes the need for such assumptions. For the standard $\tau$-leaping method, we establish convergence guarantees in KL divergence that scale linearly with the vocabulary size, improving upon prior results with quadratic dependence. Our approach is also more broadly applicable: it provides the first convergence guarantees for other widely used samplers, including the Euler method and Tweedie $\tau$-leaping. Central to our approach is a novel technique based on differential inequalities, offering a more flexible alternative to the traditional Girsanov change-of-measure methods. This technique may also be of independent interest for the analysis of other stochastic processes.
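The differential-inequality technique mentioned above replaces Girsanov's change of measure. As a rough sketch of the shape such an argument typically takes (with $\varepsilon(t)$, $a(t)$, and $b(t)$ as assumed placeholders, not the paper's actual quantities or constants), one bounds the time derivative of the KL divergence between the exact and discretized marginals and closes the bound with Grönwall's lemma:

```latex
% Schematic Gronwall-type template for a differential-inequality
% argument; illustrative only, not the paper's actual derivation.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
Let $\varepsilon(t) = \mathrm{KL}\bigl(p_t \,\|\, q_t\bigr)$ denote the
divergence between the true marginal $p_t$ and the sampler's marginal
$q_t$. If one can establish the pointwise bound
\[
  \frac{\mathrm{d}}{\mathrm{d}t}\,\varepsilon(t)
  \;\le\; a(t)\,\varepsilon(t) + b(t),
\]
then Gr\"onwall's inequality yields, without any change of measure,
\[
  \varepsilon(T)
  \;\le\; e^{\int_0^T a(s)\,\mathrm{d}s}
  \left( \varepsilon(0)
    + \int_0^T b(s)\, e^{-\int_0^s a(r)\,\mathrm{d}r}\,\mathrm{d}s
  \right).
\]
\end{document}
```

Because the argument only requires controlling the derivative of the divergence along the trajectory, rather than the likelihood ratio of entire path measures, it can accommodate samplers (such as the Euler method and Tweedie $\tau$-leaping) whose path laws are awkward to compare via Girsanov's theorem.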