Emergence of Context Characteristics Sensitivity in Large Language Models

📅 2026-06-08
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
This study investigates how large language models acquire and adjust their sensitivity to context characteristics, such as length, context-query similarity, and fluency, during instruction fine-tuning. It compares context usage across supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning with verifiable rewards (RLVR) on four models and three datasets. The results show that context preferences are reshaped at each stage: SFT makes models more likely to use contexts that are easy to understand, while later stages may either reinforce or resolve this preference depending on the training dataset. The findings point to the composition of the training data as a key factor in how well a model ultimately uses context, and to balanced dataset design as important for robustness.
📝 Abstract
During instruction fine-tuning (IFT), large language models (LLMs) learn to follow instructions by using the provided context to answer a query. While prior work has studied how context characteristics correlate with context usage by the LLM, this analysis has been limited to inference time, leaving open how these relationships are acquired in the first place. Here, we measure how models' sensitivity to such characteristics shifts across successive IFT stages: supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning with verifiable rewards (RLVR). Experiments across four models and three datasets show that SFT makes models more likely to use contexts that are easy to understand, such as contexts that are longer, more similar to the query, and more fluent. Post-SFT dynamics may either reinforce or resolve these preferences depending on the training dataset. Our findings reveal that context usage is actively reshaped at each IFT stage, and that designing a balanced IFT dataset is important for ensuring robust context utilization in instruction-tuned models.
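One way to picture "sensitivity to a context characteristic" is as the correlation between that characteristic and a per-example context-usage score, computed separately for each training stage. The sketch below is only an illustration under that assumption: the abstract does not state the paper's actual metric, so Pearson correlation is a stand-in, and the function names, stage labels, and toy numbers are all hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant sequence carries no signal; report zero sensitivity.
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def sensitivity_by_stage(characteristic, usage_by_stage):
    """Correlate one context characteristic with usage scores at each stage.

    characteristic: one value per example (e.g. context length in tokens).
    usage_by_stage: dict mapping a stage name to per-example usage scores
                    (hypothetical: 1.0 if the answer followed the context).
    """
    return {
        stage: pearson(characteristic, usage)
        for stage, usage in usage_by_stage.items()
    }


if __name__ == "__main__":
    # Made-up toy data, not results from the paper.
    context_lengths = [50, 120, 200, 310, 400]
    usage = {
        "base": [1.0, 0.0, 1.0, 0.0, 1.0],
        "sft": [0.0, 0.0, 1.0, 1.0, 1.0],
    }
    for stage, score in sensitivity_by_stage(context_lengths, usage).items():
        print(f"{stage}: {score:+.2f}")
```

Comparing the per-stage values would show whether a stage strengthens or weakens the link between a characteristic and context usage. A rank correlation or a regression over several characteristics at once would be equally reasonable choices for the same idea.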
Problem

Research questions and friction points this paper is trying to address.

context characteristics
instruction fine-tuning
large language models
sensitivity
context usage
Innovation

Methods, ideas, or system contributions that make the work stand out.

instruction fine-tuning
context sensitivity
supervised fine-tuning
preference optimization
reinforcement learning