🤖 AI Summary
Large language models often lack reliability in multi-step reasoning tasks because errors accumulate across steps. This work proposes a dynamic intervention mechanism based on the entropy of the output distribution: when entropy during generation exceeds a threshold, indicating high uncertainty, the autoregressive process is temporarily halted, a local search is performed in the latent space over the current reasoning path, and the result is refined via Soft Reasoning. Output entropy is leveraged for the first time as a signal to trigger localized optimization. The approach matches or surpasses state-of-the-art Soft Reasoning performance on four benchmarks (GSM8K, GSM-Hard, SVAMP, and StrategyQA), demonstrating the effectiveness of entropy-guided, selective reasoning refinement.
📝 Abstract
The use of Large Language Models (LLMs) for reasoning and planning tasks has drawn increasing attention in Artificial Intelligence research. Despite remarkable progress, these models still exhibit limitations in multi-step inference, particularly in mathematical and logical reasoning. We introduce PREGU (Partial Reasoning Guided by Uncertainty). PREGU monitors the entropy of the output distribution during autoregressive generation and halts the process whenever entropy exceeds a defined threshold, signaling uncertainty. From that point, a localized search is performed in the latent space to refine the partial reasoning and select the most coherent answer, using the Soft Reasoning method. Experiments with LLaMA-3-8B, Mistral-7B, and Qwen2-7B on four reasoning benchmarks (GSM8K, GSM-Hard, SVAMP, and StrategyQA) showed performance matching or exceeding Soft Reasoning, indicating that entropy can serve as an effective signal to trigger selective refinement during reasoning.
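The trigger described above can be sketched minimally: compute the Shannon entropy of the softmax distribution over next-token logits at each decoding step, and flag the step for refinement when entropy crosses a threshold. This is an illustrative sketch, not the paper's implementation; the threshold value and the toy logit vectors below are assumptions for demonstration.

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over next-token logits."""
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def should_refine(logits, threshold=1.0):
    """Hypothetical trigger: halt decoding and refine when uncertainty is high.
    The threshold is an assumed hyperparameter, not a value from the paper."""
    return token_entropy(logits) > threshold

# A peaked distribution (model is confident) vs. a flat one (model is uncertain).
confident = [10.0, 0.0, 0.0, 0.0]
uncertain = [1.0, 1.0, 1.0, 1.0]  # uniform over 4 tokens -> entropy = ln(4) ~ 1.386
```

In a real decoding loop, `should_refine` would be checked on the model's logits at every generated token; when it fires, generation pauses and the latent-space search over the partial reasoning takes over.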