Mixture of Reasonings: Teach Large Language Models to Reason with Adaptive Strategies

📅 2025-07-01

📈 Citations: 0

✨ Influential: 0


🤖 AI Summary

To address limitations of large language models (LLMs), including reliance on manually engineered task prompts, poor generalization across reasoning tasks, and suboptimal inference efficiency, this paper proposes the Mixture of Reasoning (MoR) framework. MoR embeds diverse reasoning strategies directly into model parameters, enabling task-adaptive reasoning without external prompt engineering. It uses a two-phase data construction pipeline: (1) Thought Generation, in which reasoning chain templates are created with models such as GPT-4o, and (2) SFT Dataset Construction, in which these templates are paired with benchmark datasets for supervised fine-tuning (SFT). In the reported experiments, MoR150 achieves 0.730 (a 2.2% improvement) using chain-of-thought (CoT) prompting and 0.734 (a 13.5% improvement) compared to baselines. The authors present this as a generalizable approach to robust reasoning across diverse tasks.


📝 Abstract

Large language models (LLMs) excel in complex tasks through advanced prompting techniques like Chain-of-Thought (CoT) and Tree-of-Thought (ToT), but their reliance on manually crafted, task-specific prompts limits adaptability and efficiency. We introduce Mixture of Reasoning (MoR), a training framework that embeds diverse reasoning strategies into LLMs for autonomous, task-adaptive reasoning without external prompt engineering. MoR has two phases: Thought Generation, creating reasoning chain templates with models like GPT-4o, and SFT Dataset Construction, pairing templates with benchmark datasets for supervised fine-tuning. Our experiments show that MoR significantly enhances performance, with MoR150 achieving 0.730 (2.2% improvement) using CoT prompting and 0.734 (13.5% improvement) compared to baselines. MoR eliminates the need for task-specific prompts, offering a generalizable solution for robust reasoning across diverse tasks.
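
The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how templates are matched to benchmark items, so uniform random pairing is assumed here, and the hard-coded `TEMPLATES` list merely stands in for the output of Thought Generation (no GPT-4o call is made). The names `build_sft_records` and `to_jsonl` and the record fields are hypothetical.

```python
import json
import random

# Stand-in for Phase 1 (Thought Generation). In the paper these templates
# come from a model such as GPT-4o; here they are hard-coded placeholders.
TEMPLATES = [
    "Break the problem into smaller steps and solve each step in order.",
    "Restate the question, list the known facts, then derive the answer.",
    "Consider a simpler analogous problem first, then generalize.",
]


def build_sft_records(questions, templates, seed=0):
    """Phase 2 sketch: pair each benchmark question with one template.

    Assumption: the template is chosen uniformly at random. The source
    only states that templates are paired with benchmark datasets.
    """
    if not templates:
        raise ValueError("at least one template is required")
    rng = random.Random(seed)  # seeded so the pairing is reproducible
    records = []
    for question in questions:
        template = rng.choice(templates)
        records.append({"question": question, "reasoning_template": template})
    return records


def to_jsonl(records):
    """Serialize records as JSON Lines, a common format for SFT data."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    demo_questions = [
        "If a train travels 60 km in 1.5 hours, what is its average speed?",
        "Is 91 a prime number?",
    ]
    print(to_jsonl(build_sft_records(demo_questions, TEMPLATES)))
```

A real pipeline would also need a target response for each record, for example the model's reasoning under the chosen template, before fine-tuning; that step is left out because the abstract gives no detail on it.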

Problem

Research questions and friction points this paper is trying to address.

Enables LLMs to autonomously adapt reasoning strategies without manual prompts

Integrates diverse reasoning methods into a single training framework

Improves performance across tasks by eliminating task-specific prompt engineering

Innovation

Methods, ideas, or system contributions that make the work stand out.

Embed diverse reasoning strategies into LLMs

Autonomous task-adaptive reasoning without prompts

Enhance performance with supervised fine-tuning

🔎 Similar Papers

Semantic Self-Consistency: Enhancing Language Model Reasoning via Semantic Weighting