🤖 AI Summary
Standard supervised fine-tuning (SFT) uniformly penalizes all tokens, degrading output diversity and generalization in mathematical reasoning. Method: We propose selective critical-token fine-tuning, which identifies sparse, causally critical tokens—those whose perturbation alters reasoning correctness—via counterfactual analysis, and applies gradient updates exclusively at these positions while preserving the original token distributions elsewhere to maintain diversity and robustness. The method integrates seamlessly into standard SFT and supports test-time sampling extensions and reinforcement learning initialization. Results: Experiments across three model families (5 models) and 11 mathematical reasoning benchmarks show that fine-tuning fewer than 12% of tokens consistently outperforms full SFT, while increasing output entropy and improving training stability.
📝 Abstract
Large language models (LLMs) rely on supervised fine-tuning (SFT) as a key method for adapting pre-trained models to domain-specific tasks such as mathematical reasoning. However, standard SFT uniformly penalizes all tokens, neglecting the fact that only a small subset of critical tokens determines reasoning correctness. This uniform supervision often reduces output diversity and limits generalization. We propose Critical Token Fine-tuning (CFT), a simple yet effective approach that updates only tokens identified as functionally indispensable via counterfactual perturbations. By focusing gradient signals on these decisive reasoning steps while preserving the distributions of non-critical tokens, CFT enhances both generation quality and diversity. Extensive experiments on five models across three families (Qwen, OLMo, LLaMA) and eleven mathematical reasoning benchmarks show that CFT, despite fine-tuning fewer than 12% of tokens, consistently outperforms standard SFT. Moreover, CFT enables test-time scaling through improved sampling diversity and provides a stronger initialization for reinforcement learning, sustaining performance gains in later training stages while maintaining higher entropy for better exploration. These results highlight CFT as a practical and general framework for efficient and robust LLM fine-tuning.
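The core mechanism described above is a masked training loss: gradient updates apply only at positions flagged as causally critical, while non-critical positions contribute no signal and thus leave the model's original distribution untouched. A minimal sketch of such a selective loss is below; the function name and the per-token log-probability inputs are illustrative assumptions, not the paper's implementation.

```python
import math

def critical_token_loss(token_logprobs, critical_mask):
    """Cross-entropy-style loss restricted to critical tokens (illustrative sketch).

    token_logprobs: log-probabilities the model assigns to each reference token.
    critical_mask:  1 where counterfactual perturbation of the token was found to
                    flip reasoning correctness, else 0.

    Only masked-in positions contribute to the loss, so non-critical tokens
    receive no gradient and their original distributions are preserved.
    """
    selected = [-lp for lp, m in zip(token_logprobs, critical_mask) if m]
    if not selected:
        return 0.0  # no critical tokens in this sequence -> no update
    return sum(selected) / len(selected)

# Toy usage: only the second token is critical, so the loss equals its
# negative log-probability alone.
loss = critical_token_loss([-0.1, -2.0, -0.5], [0, 1, 0])
```

In practice this would be a mask applied to the per-token negative log-likelihoods inside a standard SFT training loop, so the method slots into existing pipelines with minimal change.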