Thinking Economically: A Hierarchical Framework for Adaptive-Complexity Reasoning in LLMs

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses “overthinking” in large language models (LLMs): generating excessively long reasoning chains that consume compute without a matching gain in accuracy. Existing efficiency methods apply uniform compression, which ignores that reasoning complexity varies both across problems and within individual reasoning steps. The authors propose the Hierarchical Adaptive Budgeter (HAB), a training framework built on the principle of “Thinking Economically” that budgets from coarse to fine. At the problem level, it predicts the optimal reasoning depth. At the step level, it learns token-budgeting signals from perplexity (PPL)-derived step comparisons and an adaptive Pareto optimization objective that captures the local quality-efficiency trade-off, while a Fisher information-based pruner provides fine-grained training-time guidance so that the generator internalizes more economical reasoning patterns. On GSM8K and MATH500, HAB is reported to exceed standard chain-of-thought in accuracy while using fewer tokens, giving a better performance-efficiency trade-off than the compared baselines.
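To make the quality-efficiency trade-off concrete, the following is a minimal sketch of a generic Pareto filter over candidate reasoning steps. It is not the paper's objective: the `Candidate` type, the `quality` score, and the example values are assumptions made up for illustration. It only shows what it means for one candidate to dominate another when higher quality and fewer tokens are both preferred.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical candidate rewrite of one reasoning step."""
    name: str
    quality: float  # assumed score, higher is better
    tokens: int     # token cost, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both axes and strictly better on one."""
    no_worse = a.quality >= b.quality and a.tokens <= b.tokens
    strictly_better = a.quality > b.quality or a.tokens < b.tokens
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates (input order preserved)."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative values only; they do not come from the paper.
candidates = [
    Candidate("verbose", 0.9, 120),
    Candidate("concise", 0.9, 80),
    Candidate("terse", 0.7, 40),
    Candidate("padded", 0.6, 60),
]
front = pareto_front(candidates)
```

In this toy example, `verbose` is dominated by `concise` (same quality, more tokens) and `padded` by `terse`, so only `concise` and `terse` remain as non-dominated options.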

📝 Abstract

Chain-of-Thought (CoT) has significantly enhanced LLM reasoning, yet it often incurs substantial computational overhead due to "overthinking": generating excessively long rationales without commensurate accuracy gains. Existing efficiency methods typically apply uniform compression, which overlooks a critical observation: reasoning complexity is heterogeneous at two distinct granularities, across different problems and within individual reasoning steps. This motivates our principle of Thinking Economically: intelligently allocating computational resources based on intrinsic task and step demands rather than pursuing uniform brevity. We propose the Hierarchical Adaptive Budgeter (HAB), a training framework that operationalizes this principle through coarse-to-fine budgeting. At the inter-step level, HAB predicts the optimal reasoning depth for each problem. At the intra-step level, HAB learns step-specific token budgeting signals from PPL-derived step comparisons and an adaptive Pareto optimization objective that captures the local quality-efficiency trade-off, while a Fisher Information-based pruner provides further fine-grained training-time guidance, encouraging the generator to internalize more economical reasoning patterns. Experiments on GSM8K and MATH500 show that HAB not only surpasses standard CoT in accuracy but also reduces token usage, achieving a stronger performance-efficiency trade-off than the compared baselines.
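As a rough illustration of what a PPL-derived step comparison could look like, the sketch below computes the perplexity of a reasoning step from its per-token log-probabilities (the standard definition, the exponential of the negative mean log-probability) and applies a simple preference rule. The rule itself, the `prefer_shorter` helper, and the `tolerance` parameter are assumptions for illustration; the abstract does not specify how HAB compares steps.

```python
import math
from typing import Sequence


def step_perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity of one step from per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("a step needs at least one token")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def prefer_shorter(
    full: Sequence[float],
    short: Sequence[float],
    tolerance: float = 1.10,  # hypothetical slack factor, not from the paper
) -> bool:
    """Assumed rule: prefer the shorter step if it has fewer tokens and its
    perplexity is at most `tolerance` times that of the longer step."""
    if len(short) >= len(full):
        return False
    return step_perplexity(short) <= tolerance * step_perplexity(full)


# Illustrative log-probabilities only; they do not come from the paper.
full_step = [math.log(0.5)] * 6        # six tokens, each with probability 0.5
compact_step = [math.log(0.5)] * 3     # three tokens, same per-token confidence
uncertain_step = [math.log(0.25)] * 3  # three tokens, lower confidence

keep_compact = prefer_shorter(full_step, compact_step)
keep_uncertain = prefer_shorter(full_step, uncertain_step)
```

Under these assumptions, the compact step is preferred because it is shorter at equal perplexity, while the uncertain step is rejected because its perplexity is twice that of the full step. Pairwise preferences of this kind are one plausible source of step-level budgeting signals.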

Problem

Research questions and friction points this paper is trying to address.

Chain-of-Thought

computational overhead

reasoning complexity

efficiency

adaptive budgeting

Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive reasoning

computational efficiency

hierarchical budgeting