Constrained Adaptive Rejection Sampling

📅 2025-10-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the dichotomy in constrained generation, where greedy constrained decoding distorts the model's true distribution and standard rejection sampling wastes computation, this paper proposes Constrained Adaptive Rejection Sampling (CARS), a distribution-preserving and computationally efficient adaptive rejection sampling method. Its core idea is to integrate dynamic pruning into rejection sampling: a trie dynamically tracks the probability mass assigned to continuations proven invalid and subtracts it from future draws, so the acceptance rate improves monotonically while the samples incur no distributional shift. CARS thereby unifies unconstrained sampling, adaptive rejection, and prefix-aware pruning. Evaluated on highly constrained tasks, including program fuzzing and molecule generation, it reduces the average number of LM forward passes per valid sample by 2.1× compared to baselines, while maintaining higher output diversity and semantic validity than both greedy constrained decoding and approximate methods.
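One way to read the mass-subtraction step (the notation below is ours, not the paper's): at prefix $x$, the next token $t$ is drawn from the LM distribution with the invalidated mass removed,

$$q(t \mid x) \;\propto\; p_{\mathrm{LM}}(t \mid x)\,\bigl(1 - m(xt)\bigr),$$

where $m(xt) \in [0,1]$ is the fraction of the probability mass under prefix $xt$ that the trie has already proven constraint-violating. Since $m$ only ever grows, the surviving mass shrinks toward the valid set and the acceptance rate cannot decrease.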

📝 Abstract
Language Models (LMs) are increasingly used in applications where generated outputs must satisfy strict semantic or syntactic constraints. Existing approaches to constrained generation fall along a spectrum: greedy constrained decoding (GCD) methods enforce validity during decoding but distort the LM's distribution, while rejection sampling (RS) preserves fidelity but wastes computation by discarding invalid outputs. Both extremes are problematic in domains such as program fuzzing, where both validity and diversity of samples are essential. We present Constrained Adaptive Rejection Sampling (CARS), an approach that strictly improves the sample-efficiency of RS without distributional distortion. CARS begins with unconstrained LM sampling and adaptively rules out constraint-violating continuations by recording them in a trie and subtracting their probability mass from future draws. This adaptive pruning ensures that prefixes proven invalid are never revisited, acceptance rates improve monotonically, and the resulting samples exactly follow the constrained distribution. In experiments on a variety of domains, e.g., program fuzzing and molecular generation, CARS consistently achieves higher efficiency, measured in the number of LM forward passes per valid sample, while also producing stronger sample diversity than both GCD and methods that approximate the LM's distribution.
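The abstract's trie-and-subtraction loop can be pictured with a short sketch. The code below is our minimal reading of it, not the authors' implementation: `lm_next_probs` and `prefix_ok` are hypothetical stand-ins for one LM forward pass and an incremental constraint checker, and the bookkeeping is simplified to pruning fully invalid subtrees.

```python
import random

class TrieNode:
    """One sampled prefix; pruned_mass is the fraction of the LM's
    conditional probability mass below this prefix proven invalid so far."""
    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}      # token -> TrieNode
        self.probs = None       # cached LM next-token distribution at this prefix
        self.pruned_mass = 0.0

def _propagate(node):
    # Recompute each ancestor's pruned mass as the probability-weighted
    # sum of its children's pruned mass.
    while node.parent is not None:
        parent = node.parent
        parent.pruned_mass = sum(
            parent.probs[t] * c.pruned_mass for t, c in parent.children.items()
        )
        node = parent

def cars_sample(lm_next_probs, prefix_ok, eos, n_samples, max_len=64):
    """Draw n_samples constraint-satisfying sequences.

    lm_next_probs(prefix) -> {token: prob}  # stands in for one LM forward pass
    prefix_ok(prefix)     -> bool           # False once no valid completion exists
    """
    root, out = TrieNode(), []
    while len(out) < n_samples:
        if root.pruned_mass >= 1.0:
            raise RuntimeError("constraint admits no sequence within max_len")
        node, seq = root, []
        for _ in range(max_len):
            if node.probs is None:
                node.probs = lm_next_probs(seq)
            # Subtract mass already proven invalid under each token, renormalize.
            weights = {}
            for tok, p in node.probs.items():
                child = node.children.get(tok)
                alive = p * (1.0 - child.pruned_mass) if child else p
                if alive > 0.0:
                    weights[tok] = alive
            if not weights:                 # every continuation here is dead
                node.pruned_mass = 1.0
                _propagate(node)
                break
            tok = random.choices(list(weights), weights=list(weights.values()))[0]
            if tok not in node.children:
                node.children[tok] = TrieNode(parent=node)
            node, seq = node.children[tok], seq + [tok]
            if not prefix_ok(seq):          # violation: prune subtree for good
                node.pruned_mass = 1.0
                _propagate(node)
                break
            if tok == eos:
                out.append(seq)             # accepted mass stays, keeping draws exact
                break
        else:                               # hit max_len without EOS: treat as invalid
            node.pruned_mass = 1.0
            _propagate(node)
    return out
```

Note the asymmetry: only proven-invalid mass is removed. Accepted sequences are deliberately left unpruned, since removing their mass would skew later draws away from the constrained distribution, whereas removing only invalid mass keeps each draw exact while the acceptance rate can only rise.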
Problem

Research questions and friction points this paper is trying to address.

Enforcing strict constraints while preserving language model distribution fidelity
Improving sample efficiency in constrained generation by eliminating wasted computation
Balancing validity and diversity in domains like program fuzzing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptively prunes invalid prefixes using a trie (toy example after this list)
Subtracts their probability mass from future draws
Improves sample efficiency without distributional distortion
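A toy run of the `cars_sample` sketch above illustrates the pruning behavior; the vocabulary, LM, and constraint here are invented for illustration, not taken from the paper.

```python
# Vocabulary {a, b, <eos>}, a constant toy LM, and a constraint forbidding
# the token "b" anywhere. Once "b" is drawn and pruned at a given prefix,
# that branch is never sampled again, so rejections fall off monotonically.
def toy_lm(prefix):
    return {"a": 0.4, "b": 0.4, "<eos>": 0.2}

def no_b(prefix):
    return "b" not in prefix

print(cars_sample(toy_lm, no_b, "<eos>", n_samples=4, max_len=6))
# e.g. [['<eos>'], ['a', 'a', '<eos>'], ['a', '<eos>'], ['<eos>']]
```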