Suffix-Constrained Greedy Search Algorithms for Causal Language Models

📅 2026-03-01
🤖 AI Summary
This work addresses a practical obstacle to extracting information from large language models: their free-form outputs lack consistent formatting, which makes reliable answer extraction difficult. To tackle this issue, the paper introduces suffix constraints into the generation process of causal language models for the first time and proposes several constrained generation algorithms based on greedy search. By forcing the end of each output to conform to a predefined template, the method makes answer parsing deterministic while preserving the autoregressive inference mechanism. Experiments on several datasets show that the approach guarantees parseable outputs while maintaining, or even improving, model accuracy on tasks such as mathematical question answering.

📝 Abstract
Large language models (LLMs) are powerful tools that have found applications beyond human-machine interfaces and chatbots. In particular, their ability to generate reasoning traces has motivated their use in many prediction tasks, such as math question answering. Unfortunately, extracting the final answer from an LLM's free-form output is difficult, as it is an information extraction problem in its own right. In this work, we introduce suffix-constrained generation, which aims to produce well-formed LLM responses in which final answers follow strict templates and are guaranteed to be trivially parseable. To this end, we introduce several algorithms based on greedy search procedures. We experiment on several datasets and show that our approach guarantees trivial, deterministic extraction of the final answer from an LLM output without degrading results, and can even improve them.
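
To make the idea concrete, here is a minimal Python sketch of one way a suffix constraint could be combined with greedy decoding. It is an illustration only and not the paper's algorithms, which this summary does not describe. Everything in it is an assumption: the toy scorer `toy_scores` stands in for a causal language model, and the template "The final answer is: <digits>." and all function names are hypothetical. The reasoning trace is decoded freely, the fixed part of the template is then forced, and the answer slot is decoded greedily over digits only, so a single regular expression always recovers the answer.

```python
import re
from typing import Callable, Dict, List

EOS = "<eos>"
DIGITS = list("0123456789")
# Hypothetical template: fixed tokens, then a digit-only slot closed by ".".
TEMPLATE_PREFIX = ["The", "final", "answer", "is", ":"]
ANSWER_RE = re.compile(r"The final answer is: (\d+)\.$")

Scorer = Callable[[List[str]], Dict[str, float]]


def toy_scores(prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a causal LM: next-token scores for a prefix."""
    script = ["2", "plus", "3", "makes", "5", EOS]
    vocab = set(script) | set(DIGITS) | {"."} | set(TEMPLATE_PREFIX)
    scores = {tok: 0.0 for tok in vocab}
    if len(prefix) < len(script):
        scores[script[len(prefix)]] = 1.0  # scripted free-form reasoning
    elif prefix[-1] == ":":
        scores["5"] = 1.0                  # answer right after the template
    else:
        scores["."] = 1.0                  # then close the template
    return scores


def suffix_constrained_greedy(scores_fn: Scorer,
                              max_free_tokens: int = 20,
                              max_answer_digits: int = 4) -> str:
    context: List[str] = []
    # Phase 1: unconstrained greedy decoding of the reasoning trace.
    for _ in range(max_free_tokens):
        scores = scores_fn(context)
        token = max(scores, key=scores.get)
        if token == EOS:
            break
        context.append(token)
    reasoning = list(context)
    # Phase 2: force the fixed part of the suffix template.
    context.extend(TEMPLATE_PREFIX)
    # Phase 3: greedy decoding restricted to digits; "." ends the slot.
    digits: List[str] = []
    while True:
        if not digits:
            allowed = DIGITS              # need at least one digit
        elif len(digits) >= max_answer_digits:
            allowed = ["."]               # slot is full: must close
        else:
            allowed = DIGITS + ["."]
        scores = scores_fn(context)
        token = max(allowed, key=lambda t: scores.get(t, float("-inf")))
        context.append(token)
        if token == ".":
            break
        digits.append(token)
    return " ".join(reasoning) + " The final answer is: " + "".join(digits) + "."


def extract_answer(text: str) -> str:
    """Deterministic parsing: the suffix always matches the template."""
    match = ANSWER_RE.search(text)
    assert match is not None, "output violates the suffix template"
    return match.group(1)


text = suffix_constrained_greedy(toy_scores)
print(text)
print(extract_answer(text))
```

Because the suffix is enforced during decoding rather than checked afterwards, extraction needs no fallback heuristics. In this sketch the template is forced as soon as the model ends its reasoning or the token budget runs out, which is a simplification chosen for brevity.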
Problem

Research questions and friction points this paper is trying to address.

large language models
answer extraction
information extraction
reasoning traces
final answer parsing
Innovation

Methods, ideas, or system contributions that make the work stand out.

suffix-constrained generation
greedy search
causal language models
answer parsing
structured output