A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

📅 2025-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses key inefficiencies in large reasoning models (LRMs): redundant reasoning chains, over-analysis of simple queries, and insufficient path exploration on complex tasks. To tackle these issues, the authors propose the first holistic analytical framework for reasoning efficiency, spanning the pretraining, inference, and deployment phases. The framework introduces cross-stage efficacy metrics and systematically surveys techniques such as sparse reasoning, dynamic termination, path pruning, hierarchical verification, and multimodal compression, emphasizing joint optimization of token economy and computational efficiency. The authors identify six canonical inefficiency patterns, map each to a corresponding family of solutions, and maintain a continuously updated GitHub repository tracking progress in the field. The result is a reusable methodology for deploying lightweight agent systems, advancing LRMs from “strong reasoning” toward “efficient strong reasoning.”
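Of the techniques the summary lists, dynamic termination is the simplest to illustrate: decoding stops once the reasoning trace reaches an answer marker or exhausts a token budget, instead of letting the chain run on. The sketch below is a minimal, hypothetical illustration of that idea, not code from the paper; the token stream, the `<answer>` marker, and the budget of 64 are all assumptions for the example.

```python
def truncate_reasoning(token_stream, max_tokens=64, answer_marker="<answer>"):
    """Dynamic termination: cut a reasoning trace at an answer marker
    or at a hard token budget, whichever comes first."""
    kept = []
    for tok in token_stream:
        kept.append(tok)
        # Stop as soon as the model has committed to an answer,
        # or the token budget is spent.
        if tok == answer_marker or len(kept) >= max_tokens:
            break
    return kept

# Hypothetical trace: a few reasoning steps, an answer marker,
# then redundant continuation that dynamic termination discards.
trace = ["Let", "x", "=", "2", ".", "So", "x", "+", "2", "=", "4", ".",
         "<answer>", "4", "then", "again", "let", "x", "=", "2", "..."]
out = truncate_reasoning(iter(trace))
```

In a real decoder the same logic would sit inside the generation loop (e.g. as a stopping criterion), so the redundant tokens are never generated at all rather than trimmed after the fact.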

📝 Abstract
Recent Large Reasoning Models (LRMs), such as DeepSeek-R1 and OpenAI o1, have demonstrated strong performance gains by scaling up the length of Chain-of-Thought (CoT) reasoning during inference. However, a growing concern lies in their tendency to produce excessively long reasoning traces, which are often filled with redundant content (e.g., repeated definitions), over-analysis of simple problems, and superficial exploration of multiple reasoning paths for harder tasks. This inefficiency introduces significant challenges for training, inference, and real-world deployment (e.g., in agent-based systems), where token economy is critical. In this survey, we provide a comprehensive overview of recent efforts aimed at improving reasoning efficiency in LRMs, with a particular focus on the unique challenges that arise in this new paradigm. We identify common patterns of inefficiency, examine methods proposed across the LRM lifecycle, i.e., from pretraining to inference, and discuss promising future directions for research. To support ongoing development, we also maintain a real-time GitHub repository tracking recent progress in the field. We hope this survey serves as a foundation for further exploration and inspires innovation in this rapidly evolving area.
Problem

Research questions and friction points this paper is trying to address.

LRMs produce excessively long, redundant reasoning traces
Inefficient reasoning raises costs across training, inference, and deployment
The survey identifies common inefficiency patterns and methods to address them
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimizing Chain-of-Thought reasoning efficiency
Reducing redundant content in reasoning traces
Improving token economy in training and inference