Order Independence With Finetuning

📅 2025-03-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) are order-sensitive in multiple-choice question answering: permuting semantically equivalent answer options can change their predictions, exposing a robustness gap. To address this, the paper integrates Set-Based Prompting (SBP) into supervised fine-tuning, jointly optimizing the set-based input format and the model parameters. This co-training makes order invariance an intrinsic property of the model while preserving general language modeling ability. Crucially, whereas SBP applied only at inference induces a distributional shift, embedding SBP in training pulls set-formatted inputs onto the model's training manifold, keeping them compatible with standard training dynamics. Evaluations on MMLU, CSQA, and ARC Challenge show substantial accuracy gains together with strong invariance to answer reordering, improving in-distribution performance and structural robustness at the same time.

📝 Abstract
Large language models (LLMs) demonstrate remarkable performance on many NLP tasks, yet often exhibit order dependence: simply reordering semantically identical tokens (e.g., answer choices in multiple-choice questions) can lead to inconsistent predictions. Recent work proposes Set-Based Prompting (SBP) as a way to remove order information from designated token subsets, thereby mitigating positional biases. However, applying SBP on base models induces an out-of-distribution input format, which can degrade in-distribution performance. We introduce a fine-tuning strategy that integrates SBP into the training process, "pulling" these set-formatted prompts closer to the model's training manifold. We show that SBP can be incorporated into a model via fine-tuning. Our experiments on in-distribution (MMLU) and out-of-distribution (CSQA, ARC Challenge) multiple-choice tasks show that SBP fine-tuning significantly improves accuracy and robustness to answer-order permutations, all while preserving broader language modeling capabilities. We discuss the broader implications of order-invariant modeling and outline future directions for building fairer, more consistent LLMs.
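As a rough illustration of the idea the abstract describes, SBP-style order removal is typically realized by giving each answer-option sub-sequence the same positional indices and blocking attention between options. The sketch below is an assumption about the mechanics (the function name, inputs, and exact masking scheme are illustrative, not the paper's implementation):

```python
def build_set_prompt(prefix_len, option_lens):
    """Sketch of a set-formatted prompt encoding (hypothetical helper).

    Returns (position_ids, attention_mask) where every option
    sub-sequence restarts at the same position index and cannot
    attend to the other options, so the encoding is identical
    under any permutation of the options.
    """
    total = prefix_len + sum(option_lens)
    # Shared question/prefix tokens get ordinary positions 0..prefix_len-1.
    position_ids = list(range(prefix_len))
    spans = []
    start = prefix_len
    for n in option_lens:
        spans.append((start, start + n))
        # Each option's tokens reuse the same position indices.
        position_ids.extend(range(prefix_len, prefix_len + n))
        start += n
    # Standard causal mask: token i may attend to tokens j <= i.
    mask = [[j <= i for j in range(total)] for i in range(total)]
    # Remove cross-option attention so no option "sees" another.
    for a in spans:
        for b in spans:
            if a != b:
                for i in range(a[0], a[1]):
                    for j in range(b[0], b[1]):
                        mask[i][j] = False
    return position_ids, mask
```

With a 2-token prefix and two 2-token options, both options occupy positions 2–3, so swapping the options leaves the per-option encodings unchanged.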
Problem

Research questions and friction points this paper is trying to address.

Mitigating order dependence in LLM predictions
Improving robustness to answer-order permutations
Preserving language modeling capabilities during fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates Set-Based Prompting (SBP) into supervised fine-tuning
SBP mitigates positional biases in LLMs
Improves both accuracy and robustness to answer ordering
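The order-robustness claim above is typically measured by checking whether a model picks the same underlying answer across all permutations of the options. A minimal sketch of such a consistency metric (the `predict` wrapper and function name are hypothetical, not from the paper):

```python
from itertools import permutations

def order_consistency(predict, question, options):
    """Fraction of option orderings on which the model selects the
    same underlying answer. `predict(question, opts)` is a hypothetical
    wrapper returning the index of the chosen option in `opts`.
    """
    picks = []
    for perm in permutations(options):
        idx = predict(question, list(perm))
        picks.append(perm[idx])  # map the index back to the option text
    most_common = max(set(picks), key=picks.count)
    return picks.count(most_common) / len(picks)
```

A fully order-invariant model scores 1.0; a model that always picks the first option scores 1/n for n options, since each option appears first in an equal share of the permutations.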