Chain of Simulation: A Dual-Mode Reasoning Framework for Large Language Models with Dynamic Problem Routing

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models struggle to dynamically select optimal reasoning strategies for diverse tasks, limiting their performance. To address this, the authors propose the Chain-of-Simulation (CoS) framework, which introduces the first dynamic dual-mode reasoning router conditioned on problem type: it employs self-consistent computational streams for mathematical problems, JSON-based symbolic state tracking for spatial reasoning, and hybrid fact extraction for multi-hop reasoning. Notably, CoS requires no additional training and achieves accuracies of 71.5%, 90.0%, and 19.0% on GSM8K, StrategyQA, and bAbI, respectively, yielding absolute improvements of 1.0 and 2.5 points on GSM8K and StrategyQA and a 65.2% relative improvement on bAbI over the strongest baselines, while reducing computational cost by 54%.

📝 Abstract
We present Chain of Simulation (CoS), a novel dual-mode reasoning framework that dynamically routes problems to specialized reasoning strategies in Large Language Models (LLMs). Unlike existing uniform prompting approaches, CoS employs three distinct reasoning modes: (1) computational flow with self-consistency for mathematical problems, (2) symbolic state tracking with JSON representations for spatial reasoning, and (3) hybrid fact-extraction for multi-hop inference. Through comprehensive evaluation on GSM8K, StrategyQA, and bAbI benchmarks using four state-of-the-art models (Gemma-3 27B, LLaMA-3.1 8B, Mistral 7B, and Qwen-2.5 14B), we demonstrate that CoS achieves 71.5% accuracy on GSM8K (1.0% absolute improvement), 90.0% on StrategyQA (2.5% improvement), and 19.0% on bAbI (65.2% relative improvement) compared to the strongest baselines. The analysis reveals that problem-specific mode selection is crucial, with computational mode achieving 81.2% accuracy when correctly applied to mathematical problems, while misrouting leads to 0% accuracy. We provide detailed algorithms for mode selection, state tracking, and answer extraction, establishing CoS as an effective approach for improving LLM reasoning without additional training. The framework provides superior trade-offs between accuracy and efficiency compared to Self-Consistency, achieving comparable performance at 54% lower computational cost.
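The routing step described in the abstract, selecting one of three reasoning modes per problem, can be sketched with simple surface-feature heuristics. This is a minimal illustration, not the paper's actual classifier: the keyword patterns and mode names below are assumptions for demonstration.

```python
# Hypothetical sketch of CoS-style dynamic problem routing.
# The regex heuristics are illustrative assumptions, not the paper's method.
import re

def route(problem: str) -> str:
    """Pick a reasoning mode from surface features of the problem text."""
    text = problem.lower()
    # Math word problems: digits plus quantitative cue words
    if re.search(r"\d", text) and re.search(r"\b(how many|total|cost|sum|more than)\b", text):
        return "computational"      # -> self-consistent computational streams
    # Spatial/state problems: movement and location cue words
    if re.search(r"\b(left|right|north|south|above|below|moved|picked up)\b", text):
        return "symbolic_state"     # -> JSON-based symbolic state tracking
    # Everything else falls back to multi-hop fact extraction
    return "fact_extraction"

print(route("Tom has 3 apples and buys 4 more. How many apples does he have?"))
```

A production router would presumably use the LLM itself (or a learned classifier) rather than regexes, but the control flow, one route per problem type with a default fallback, is the same.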
Problem

Research questions and friction points this paper is trying to address.

reasoning strategies
problem routing
large language models
dynamic reasoning
multi-hop inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Chain of Simulation
Dynamic Problem Routing
Dual-Mode Reasoning
Symbolic State Tracking
Self-Consistency
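The "Symbolic State Tracking" contribution above maintains a JSON world state that is updated as each sentence of a spatial problem is processed. A minimal sketch, assuming a hypothetical event schema (the `action`/`entity`/`to` keys are illustrative, not from the paper):

```python
# Hypothetical sketch of JSON-based symbolic state tracking for spatial
# reasoning. The event schema and update rules are illustrative assumptions.
import json

def apply_event(state: dict, event: dict) -> dict:
    """Update the JSON world state with one parsed event."""
    if event["action"] == "move":
        # An entity changes location
        state[event["entity"]] = {"location": event["to"]}
    elif event["action"] == "pickup":
        # An object becomes held by an entity
        state[event["object"]] = {"holder": event["entity"]}
    return state

state = {}
for ev in [
    {"action": "move", "entity": "Mary", "to": "bathroom"},
    {"action": "pickup", "entity": "Mary", "object": "apple"},
    {"action": "move", "entity": "Mary", "to": "kitchen"},
]:
    state = apply_event(state, ev)

print(json.dumps(state))
```

Answering "Where is the apple?" then reduces to following the `holder` link to Mary and reading her `location`, which is the kind of lookup an explicit JSON state makes reliable compared to free-form chain-of-thought.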