A Continuous-Time Markov Chain Framework for Insertion Language Models

📅 2026-06-08

🤖 AI Summary

Existing insertion-based language models are largely ad hoc and lack a unified theoretical foundation. This work derives a diffusion-style denoising objective from first principles by modeling the noising process as a continuous-time Markov chain on the space of variable-length sequences. The resulting framework recovers prior insertion language model formulations as special cases. On a synthetic planning task, the approach retains the advantages of insertion-based generation over left-to-right and masked diffusion models; on language modeling, it is competitive with both while offering more flexible sampling than existing insertion language models.

📝 Abstract

Insertion Language Models (ILMs) offer several advantages over left-to-right generation and mask-based generation. However, existing formulations of insertion-based generation have largely been ad hoc. In this paper, we derive a diffusion-style denoising objective for ILMs from first principles by formulating the noising process as a continuous-time Markov chain on the space of variable-length sequences. We show that previous formulations of ILMs can be viewed as special cases of this denoising framework. Through empirical evaluation on a synthetic planning task, we show that the proposed approach retains the benefits of insertion-based generation over left-to-right generation and masked diffusion models. In language modeling, our diffusion-based approach is competitive with left-to-right generation and masked diffusion models, while offering additional flexibility in sampling compared to existing insertion language models.
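
To make the idea of a continuous-time Markov chain on variable-length sequences more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's actual process: it assumes that noising deletes tokens (so that denoising would insert them), that every surviving token is deleted independently at a constant rate, and the names `noise_by_deletion`, `t`, and `rate` are hypothetical. The abstract specifies neither the rates nor the transition structure.

```python
import random


def noise_by_deletion(tokens, t, rate=1.0, rng=None):
    """Simulate a token-deletion CTMC up to time t (Gillespie-style).

    Assumption for illustration only: each surviving token is deleted
    independently at a constant rate, so the sequence shrinks over time.
    """
    rng = rng or random.Random()
    seq = list(tokens)
    clock = 0.0
    while seq:
        # With n surviving tokens, the waiting time to the next deletion
        # is exponential with total rate n * rate.
        clock += rng.expovariate(len(seq) * rate)
        if clock >= t:
            break
        # Every surviving token is equally likely to be the one deleted.
        del seq[rng.randrange(len(seq))]
    return seq


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = "the cat sat on the mat".split()
    for t in (0.0, 0.5, 2.0):
        print(t, noise_by_deletion(x0, t, rng=rng))
```

Under these assumptions each token survives to time `t` with probability `exp(-rate * t)`, so the state at any time is a shorter subsequence of the original. A denoiser in this setting would be trained to predict which tokens to insert, and where, to move back toward the clean sequence.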

Problem

Research questions and friction points this paper is trying to address.

Insertion Language Models

Continuous-Time Markov Chain

Denoising Objective

Variable-Length Sequences

Language Generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Insertion Language Models

Continuous-Time Markov Chain

Diffusion Models