ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trials and Preference Optimization

📅 2025-06-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) often suffer from verbose and inefficient chain-of-thought (CoT) reasoning due to excessive deliberation; existing approaches, such as multi-path distillation or preference learning, are prone to overfitting and rely heavily on high-quality synthetic data. Method: We propose a stepwise reasoning compression framework that employs long-short switched sampling to generate diverse reasoning trajectories, constructs dual-objective preference pairs (accuracy vs. length), trains separate high-accuracy and short-length models, and combines them via parameter interpolation to yield a balanced model. Contribution/Results: Our method is the first to decouple accuracy and length optimization, eliminating dependence on curated synthetic data and mitigating overfitting. Experiments across multiple mathematical reasoning benchmarks show a 30-50% reduction in reasoning length while maintaining or improving accuracy, with consistent performance across diverse backbone architectures. Code and data are publicly released.
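The dual-objective preference pairs mentioned above can be illustrated with a minimal sketch. Assuming each sampled trajectory is scored as a (reasoning_text, is_correct, length) tuple, accuracy pairs prefer any correct trajectory over any incorrect one, while length pairs prefer shorter correct trajectories over longer ones; the function name and data layout here are illustrative assumptions, not the paper's actual implementation:

```python
def build_preference_pairs(trajectories):
    """Build two preference-pair sets from scored reasoning trajectories.

    trajectories: list of (reasoning_text, is_correct, length) tuples.
    Returns (accuracy_pairs, length_pairs), each a list of
    (preferred, rejected) text pairs.
    """
    correct = [t for t in trajectories if t[1]]
    wrong = [t for t in trajectories if not t[1]]

    # Accuracy objective: any correct trajectory beats any incorrect one.
    accuracy_pairs = [(c[0], w[0]) for c in correct for w in wrong]

    # Length objective: among correct trajectories, shorter beats longer.
    by_len = sorted(correct, key=lambda t: t[2])
    length_pairs = [
        (by_len[i][0], by_len[j][0])
        for i in range(len(by_len))
        for j in range(i + 1, len(by_len))
        if by_len[i][2] < by_len[j][2]
    ]
    return accuracy_pairs, length_pairs


# Example: three sampled trajectories for one problem.
trajs = [
    ("short correct solution", True, 12),
    ("long correct solution", True, 40),
    ("incorrect solution", False, 25),
]
acc_pairs, len_pairs = build_preference_pairs(trajs)
```

Each pair set would then drive its own preference-optimization run (e.g. DPO-style training), yielding the accuracy-specialized and length-specialized models.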

📝 Abstract
Recent advances in Chain-of-Thought (CoT) prompting have substantially improved the reasoning capabilities of Large Language Models (LLMs). However, these methods often suffer from overthinking, leading to unnecessarily lengthy or redundant reasoning traces. Existing approaches attempt to mitigate this issue through curating multiple reasoning chains for training LLMs, but their effectiveness is often constrained by the quality of the generated data and prone to overfitting. To address the challenge, we propose Reasoning Compression ThroUgh Stepwise Trials (ReCUT), a novel method aimed at balancing the accuracy and length of reasoning trajectories. Specifically, ReCUT employs a stepwise exploration mechanism and a long-short switched sampling strategy, enabling LLMs to incrementally generate diverse reasoning paths. These paths are evaluated and used to construct preference pairs to train two specialized models (Gemini LLMs), one optimized for reasoning accuracy, the other for shorter reasoning. A final integrated model is obtained by interpolating the parameters of these two models. Experimental results across multiple math reasoning datasets and backbone models demonstrate that ReCUT significantly reduces reasoning lengths by approximately 30-50%, while maintaining or improving reasoning accuracy compared to various baselines. All code and data will be released via https://github.com/NEUIR/ReCUT.
Problem

Research questions and friction points this paper is trying to address.

Balancing reasoning length and accuracy in LLMs
Reducing overthinking in Chain-of-Thought prompting
Improving reasoning efficiency without sacrificing accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stepwise exploration for diverse reasoning paths
Long-short switched sampling strategy
Parameter interpolation of accuracy and brevity models
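The parameter-interpolation idea in the last bullet can be sketched as a simple linear merge of the two specialized models' weights. The plain-dict parameter representation and the mixing weight `alpha` below are illustrative assumptions for a toy example, not the paper's actual code; a real implementation would operate on framework state dicts:

```python
def interpolate_params(acc_params, short_params, alpha=0.5):
    """Linearly merge two models' parameters:
    merged = alpha * accurate_model + (1 - alpha) * short_model.

    Parameters are represented as {name: list_of_floats} for illustration.
    """
    assert acc_params.keys() == short_params.keys()
    return {
        name: [alpha * a + (1.0 - alpha) * s
               for a, s in zip(acc_params[name], short_params[name])]
        for name in acc_params
    }


# Toy example: merge two two-parameter "models" with equal weight.
accurate = {"w": [1.0, 2.0], "b": [0.0]}
short = {"w": [3.0, 4.0], "b": [2.0]}
merged = interpolate_params(accurate, short, alpha=0.5)
```

Sweeping `alpha` between 0 and 1 trades off between the length-optimized and accuracy-optimized endpoints of the merge.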
Zhensheng Jin
School of Computer Science and Engineering, Northeastern University, China
Xinze Li
School of Computer Science and Engineering, Northeastern University, China
Yifan Ji
School of Computer Science and Engineering, Northeastern University, China
Chunyi Peng
Computer Science, Purdue University
wireless networking, 5G, mobile systems, mobile computing, network security
Zhenghao Liu
Northeastern University
NLP, Information Retrieval
Qi Shi
Department of Computer Science and Technology, Institute for AI, Tsinghua University, China Beijing National Research Center for Information Science and Technology, China
Yukun Yan
Tsinghua University
Large Language Model
Shuo Wang
Department of Computer Science and Technology, Institute for AI, Tsinghua University, China Beijing National Research Center for Information Science and Technology, China
Furong Peng
Institute of Big Data Science and Industry/School of Computer and Information Technology, Shanxi University, China
Ge Yu
School of Computer Science and Engineering, Northeastern University, China