🤖 AI Summary
Instruction tuning substantially reduces output diversity in large language models (LLMs), hindering creative text generation. This work systematically quantifies the "diversity gap" across multiple open-weight and open-source LLMs on a narrative generation task and identifies the DPO fine-tuning stage as the primary source of diversity degradation. To address this, the authors propose **conformative decoding**, an inference-time technique that uses the more diverse base model's token probability distribution to guide sampling in its instruction-tuned counterpart, restoring lexical and syntactic diversity. Experiments on OLMo and OLMo 2 report that the method increases n-gram entropy by 23.6%, a strong indicator of enhanced diversity, while preserving or even improving generation quality (BLEU +1.4; human evaluation +0.8 on a 5-point scale). Crucially, conformative decoding requires no retraining, is computationally lightweight, and applies across models, offering a practical, deployable option for high-creativity generation tasks.
📝 Abstract
Instruction-tuning large language models (LLMs) reduces the diversity of their outputs, with implications for many tasks, particularly creative ones. This paper investigates this "diversity gap" on a writing-prompt narrative generation task, measuring it with current diversity metrics across various open-weight and open-source LLMs. The results show significant decreases in diversity due to instruction tuning. To understand where diversity is lost, we examine each fine-tuning stage of the OLMo and OLMo 2 models and find that DPO has the most substantial impact on diversity. Motivated by these findings, we present a new decoding strategy, conformative decoding, which guides an instruct model using its more diverse base model to reintroduce output diversity. We show that conformative decoding typically increases diversity while maintaining or even improving quality.
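The abstract does not spell out the mechanics of conformative decoding. As one plausible illustration of "guiding an instruct model with its more diverse base model," the sketch below interpolates the next-token distributions of the two models before sampling. The function names, the mixing scheme, and the weight `alpha` are assumptions for illustration, not the paper's actual decoding rule; real models would supply the logits that the toy vectors stand in for here.

```python
import math
import random

def softmax(logits):
    # Convert raw scores to a probability distribution (numerically stable).
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def conformative_sample(instruct_logits, base_logits, alpha=0.5, rng=random):
    # Hypothetical sketch: mix the instruct model's (typically peaked)
    # distribution with the base model's (more diverse) distribution,
    # then sample a token index from the mixture. `alpha` weights the
    # base model's contribution; this scheme is an assumption, not the
    # method described in the paper.
    p_instruct = softmax(instruct_logits)
    p_base = softmax(base_logits)
    mixed = [(1 - alpha) * pi + alpha * pb
             for pi, pb in zip(p_instruct, p_base)]
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(mixed):
        acc += p
        if r <= acc:
            return i
    return len(mixed) - 1

# Toy example: the instruct model strongly prefers token 0, while the
# base model spreads probability mass evenly across all three tokens.
instruct = [5.0, 1.0, 1.0]
base = [1.0, 1.0, 1.0]
rng = random.Random(0)
counts = [0, 0, 0]
for _ in range(10000):
    counts[conformative_sample(instruct, base, alpha=0.5, rng=rng)] += 1
```

With `alpha=0.5`, tokens 1 and 2 receive far more samples than under the instruct distribution alone (which assigns them under 2% each), illustrating how a base model can reintroduce diversity without retraining.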