🤖 AI Summary
Existing discrete sequence generation models rely on continuous embeddings and variational lower-bound (ELBO) approximations, leading to substantial bias in negative log-likelihood (NLL) estimation and slow convergence. This work introduces the first purely discrete-domain Poisson diffusion model, eliminating continuous embeddings and ELBO approximations entirely. Grounded in rigorous information-theoretic analysis, the authors derive an exact relationship between the proposed Poisson Reconstruction Loss (PRL) and the true NLL. The Poisson diffusion process is formulated as a discrete-time Markov chain, and dedicated architectures are designed for symbolic music (Lakh MIDI) and image token generation (CIFAR-10). Experiments demonstrate up to an 80% reduction in test NLL and significantly accelerated training convergence. These results validate both the theoretical soundness and practical efficiency of discrete-domain diffusion modeling.
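The summary describes the forward process as a discrete-time Markov chain inspired by photon arrivals. As an illustrative sketch only (the exact formulation is not given here), one natural way to corrupt non-negative integer tokens is to sample each step from a Poisson distribution whose mean is a scaled version of the clean counts; the function name `poisson_forward` and the schedule value `rate_scale` are hypothetical:

```python
import numpy as np

def poisson_forward(x0, rate_scale, rng=None):
    """Illustrative Poisson corruption step (not the paper's exact process).

    Mirrors photon arrival in a camera sensor: the observed count at each
    position is Poisson-distributed around a scaled version of the true
    count. `rate_scale` stands in for a hypothetical noise-schedule value.
    """
    rng = np.random.default_rng() if rng is None else rng
    x0 = np.asarray(x0, dtype=np.int64)
    # Poisson samples are always non-negative integers, so the chain
    # never leaves the discrete state-space -- no continuous embedding.
    return rng.poisson(lam=rate_scale * x0)
```

Because Poisson samples are themselves non-negative integers, repeated application of such a step keeps the data in the discrete domain throughout, which is the property the summary emphasizes.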
📝 Abstract
Existing methods for generative modeling of discrete data, such as symbolic music tokens, face two primary challenges: (1) they embed discrete inputs into continuous state-spaces, or (2) they rely on variational losses that only approximate the true negative log-likelihood. Prior work has addressed these limitations only individually: information-theoretic Gaussian diffusion models alleviate the suboptimality of variational losses, but they still perform modeling in continuous domains. In this work, we introduce the Information-Theoretic Discrete Poisson Diffusion Model (ItDPDM), which addresses both limitations simultaneously by operating directly in a discrete state-space via a Poisson diffusion process inspired by photon arrival processes in camera sensors. We introduce a novel Poisson Reconstruction Loss (PRL) and derive an exact relationship between PRL and the true negative log-likelihood, thereby eliminating the need for approximate evidence lower bounds. Experiments on the Lakh MIDI symbolic music dataset and the CIFAR-10 image benchmark demonstrate that ItDPDM delivers significant improvements, reducing test NLL by up to 80% relative to prior baselines while also converging faster.
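The abstract states that PRL relates exactly to the true negative log-likelihood rather than bounding it. For intuition, the exact per-token NLL of a Poisson observation model can be computed in closed form; this is the generic Poisson NLL, not necessarily the authors' precise PRL:

```python
import math

def poisson_nll(x, lam):
    """Exact negative log-likelihood of count x under Poisson(lam):
        -log P(x | lam) = lam - x * log(lam) + log(x!)

    Unlike an ELBO, this is the likelihood itself, with no variational
    gap -- the property the abstract attributes to PRL.
    """
    return lam - x * math.log(lam) + math.lgamma(x + 1)
```

For example, `poisson_nll(0, 1.0)` evaluates to exactly 1.0, since `P(0 | 1) = e^{-1}`. Minimizing such an exact likelihood term is what removes the bias that an evidence lower bound would otherwise introduce.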