Variational Learning for Insertion-based Generation

📅 2026-06-01

🤖 AI Summary
Most existing non-monotonic sequence generation methods are order-agnostic and rely on a fixed-length grid, so they struggle to support variable-length outputs and cannot adaptively learn insertion orders. This work proposes a permutation-based variational inference framework built on a bijection between insertion trajectories and permutations, which allows the data likelihood to be reparameterized exactly as a sum over permutations. The resulting model, the Insertion Process (IP), jointly learns where to insert, what to insert, and when to terminate. It natively supports variable-length generation and learns data-driven preferences over insertion orders. On goal-conditioned planning and molecular string generation, learning the insertion order improves both modeling quality and generalization in domains without a canonical left-to-right structure.
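One way to read the permutation sum and the variational objective in symbols is sketched below. The notation is assumed here for illustration and is not taken from the paper: $x$ is a sequence of length $n$, $S_n$ is the set of permutations of its positions, $\sigma$ is an insertion order, $p_\theta$ is the generative model, and $q_\phi$ is a variational distribution over orders. The bound shown is the standard evidence lower bound; the paper's exact objective may differ.

```latex
\log p_\theta(x)
  = \log \sum_{\sigma \in S_n} p_\theta(x, \sigma)
  \;\ge\; \mathbb{E}_{q_\phi(\sigma \mid x)}
     \big[ \log p_\theta(x, \sigma) - \log q_\phi(\sigma \mid x) \big]
```

The sum has $n!$ terms, which is why a learned $q_\phi$ over orders is a natural substitute for exact marginalization.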
📝 Abstract
Non-monotonic sequence generation methods, such as masked diffusion models, provide a flexible alternative to left-to-right autoregressive modeling by allowing tokens to be generated in orders that are neither fixed nor prescribed in advance. Despite their practical advantages, most existing non-monotonic models are order-agnostic and rely on a fixed-length grid, limiting their ability to support variable-length generation and adaptive insertion order. In this work, we introduce a probabilistic framework for learning insertion order in variable-length insertion models. We formalize a bijective correspondence between insertion trajectories and permutations, which enables an exact reparameterization of the data likelihood as a sum over permutations. Building on this result, we propose the Insertion Process (IP), a stochastic generative model that jointly learns where to insert, what to insert, and when to terminate, trained via permutation-based variational inference. Unlike prior fixed-canvas approaches, IP natively supports variable-length generation and learns data-driven preferences over insertion orders. Experiments on goal-conditioned planning and molecular string generation demonstrate that learning insertion order improves both modeling quality and generalization in domains without a canonical left-to-right structure.
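The correspondence between insertion trajectories and permutations can be illustrated with a minimal Python sketch. It assumes one plausible encoding, which may not match the paper's construction: a permutation lists the final positions in the order they are generated, and each insertion action is a `(slot, token)` pair, where `slot` is the index in the current partial sequence. The function names are hypothetical, and the termination decision is left out.

```python
from itertools import permutations


def permutation_to_trajectory(x, sigma):
    """Turn a generation order over final positions into (slot, token) insertions."""
    placed = []       # final positions generated so far, kept sorted
    trajectory = []
    for pos in sigma:
        # The slot is the number of already-placed tokens that end up left of `pos`.
        slot = sum(1 for p in placed if p < pos)
        trajectory.append((slot, x[pos]))
        placed.insert(slot, pos)
    return trajectory


def trajectory_to_permutation(trajectory):
    """Replay the insertions and recover both the final sequence and the order."""
    seq = []      # partial sequence being built
    birth = []    # birth[i] = step at which the token now at index i was inserted
    for step, (slot, token) in enumerate(trajectory):
        seq.insert(slot, token)
        birth.insert(slot, step)
    sigma = [0] * len(birth)
    for pos, step in enumerate(birth):
        sigma[step] = pos
    return seq, tuple(sigma)


def count_distinct_trajectories(x):
    """Check the round trip for every order and count the distinct trajectories."""
    seen = set()
    for sigma in permutations(range(len(x))):
        trajectory = permutation_to_trajectory(x, sigma)
        rebuilt, recovered = trajectory_to_permutation(trajectory)
        assert rebuilt == list(x) and recovered == sigma
        seen.add(tuple(trajectory))
    return len(seen)
```

For `x = ["C", "=", "O"]` and the order `(1, 0, 2)`, the trajectory is `[(0, "="), (0, "C"), (2, "O")]`. Under this encoding, a sequence of length n has n! distinct trajectories even when tokens repeat, because the slot indices alone identify the order.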
Problem

Research questions and friction points this paper is trying to address.

non-monotonic sequence generation
variable-length generation
insertion order
masked diffusion models
autoregressive modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

insertion-based generation
variable-length generation
permutation-based variational inference
non-monotonic sequence modeling
stochastic generative model