Recurrent Confidence Chain: Temporal-Aware Uncertainty Quantification in Large Language Models

📅 2026-01-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the overconfidence of large language models in reasoning, which often stems from neglecting early low-confidence steps and leads to hallucinations. To mitigate this, the paper introduces a time-aware confidence propagation mechanism and proposes a recursive chain-of-confidence architecture. This approach leverages cross-step attention to model semantic dependencies among reasoning steps and integrates hidden confidence states with a confidence fusion strategy to effectively capture both decay and accumulation of confidence over long reasoning chains. Evaluated on the GAOKAO-Math and CLadder causal reasoning benchmarks, the method significantly outperforms existing approaches, achieving superior predictive accuracy and better uncertainty calibration as measured by negative log-likelihood and expected calibration error.

📝 Abstract
As reasoning modules such as the chain-of-thought mechanism are applied to large language models, these models achieve strong performance on tasks ranging from common-sense question answering to math problem solving. A key remaining challenge is assessing the uncertainty of their answers, which helps protect users from misleading or serious hallucinations. Although current methods analyze long reasoning sequences by filtering unrelated tokens and examining connections between nearby tokens or sentences, they often overlook how confidence spreads over time. This oversight can inflate overall confidence even when earlier steps exhibit very low confidence. To address this issue, we propose a novel method that incorporates inter-step attention to model semantic correlations across reasoning steps. For long-horizon responses, we introduce a hidden confidence mechanism that retains historical confidence information and combines it with stepwise confidence to produce a more accurate overall estimate. We evaluate our method on the GAOKAO math benchmark and the CLadder causal reasoning dataset using mainstream open-source large language models. Our approach outperforms state-of-the-art methods, achieving a superior balance between predictive quality and calibration as measured by negative log-likelihood and expected calibration error.
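The abstract's mechanism can be illustrated with a minimal sketch. This is not the paper's exact formulation: the function name `chain_confidence`, the dot-product attention, the `decay` parameter, and the `min` fusion rule are all illustrative assumptions standing in for the described cross-step attention, hidden confidence state, and confidence fusion, chosen so that an early low-confidence step keeps pulling down the chain-level estimate.

```python
import math

def chain_confidence(step_confidences, step_embeddings, decay=0.9):
    # Hypothetical sketch: fuse per-step confidences into one chain-level
    # estimate via cross-step attention plus a recurrent hidden state.
    hidden = step_confidences[0]
    for t in range(1, len(step_confidences)):
        # Cross-step attention: softmax over dot-product similarity of
        # earlier step embeddings with the current step's embedding.
        sims = [sum(a * b for a, b in zip(step_embeddings[i], step_embeddings[t]))
                for i in range(t)]
        m = max(sims)
        weights = [math.exp(s - m) for s in sims]
        z = sum(weights)
        context = sum(w / z * c for w, c in zip(weights, step_confidences[:t]))
        # Recurrent fusion: decayed history blended with the weaker of the
        # current step's own confidence and its attended historical context,
        # so one shaky early step continues to drag the chain down.
        hidden = decay * hidden + (1 - decay) * min(step_confidences[t], context)
    return hidden
```

With orthogonal step embeddings, a chain of uniformly high confidences stays high, while the same chain preceded by one low-confidence step yields a markedly lower overall estimate, mirroring the decay-and-accumulation behavior the paper targets.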
Problem

Research questions and friction points this paper is trying to address.

uncertainty quantification
large language models
temporal confidence
hallucination
reasoning chains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal-aware uncertainty quantification
Inter-step attention
Hidden confidence mechanism
Confidence calibration
Chain-of-thought reasoning
Zhenjiang Mao
University of Florida
Anirudhh Venkat
University of Florida