DenoiseFlow: Uncertainty-Aware Denoising for Reliable LLM Agentic Workflows

📅 2026-02-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the degradation in reliability of large language models within long-sequence agent workflows caused by accumulating semantic ambiguities. To mitigate this, the authors propose a three-stage closed-loop framework that models multi-step reasoning as a noisy Markov decision process, enabling progressive denoising through perception, regulation, and correction. Key innovations include an uncertainty-aware adaptive computation allocation mechanism, an unsupervised online self-calibration method, and a synergistic integration of semantic uncertainty estimation, adaptive path exploration, impact-analysis-driven error correction, and verifier feedback alignment. Evaluated across six benchmarks, the approach achieves an average accuracy of 83.3%, outperforming the strongest baseline by 1.3%, while reducing computational overhead by 40–56% through dynamic branching.

📝 Abstract
Autonomous agents are increasingly entrusted with complex, long-horizon tasks, ranging from mathematical reasoning to software generation. While agentic workflows facilitate these tasks by decomposing them into multi-step reasoning chains, reliability degrades significantly as the sequence lengthens. Specifically, minor interpretation errors in natural-language instructions tend to compound silently across steps. We term this failure mode accumulated semantic ambiguity. Existing approaches to mitigate it often lack runtime adaptivity, relying instead on static exploration budgets, reactive error recovery, or single-path execution that ignores uncertainty entirely. We formalize the multi-step reasoning process as a Noisy MDP and propose DenoiseFlow, a closed-loop framework that performs progressive denoising through three coordinated stages: (1) Sensing estimates per-step semantic uncertainty; (2) Regulating adaptively allocates computation by routing between fast single-path execution and parallel exploration based on estimated risk; and (3) Correcting performs targeted recovery via influence-based root-cause localization. Online self-calibration continuously aligns decision boundaries with verifier feedback, requiring no ground-truth labels. Experiments on six benchmarks spanning mathematical reasoning, code generation, and multi-hop QA show that DenoiseFlow achieves the highest accuracy on every benchmark (83.3% average, +1.3% over the strongest baseline) while reducing cost by 40–56% through adaptive branching. Detailed ablation studies further confirm the framework's robustness and generality. Code is available at https://anonymous.4open.science/r/DenoiseFlow-21D3/.
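The sense → regulate → correct loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the names (`denoise_step`, `generate`, `verify`), the entropy-based uncertainty proxy, and the 0.8 routing threshold are all assumptions for the sketch.

```python
import math
from collections import Counter

def semantic_uncertainty(candidates):
    """Entropy over a small sample of candidate outputs: a crude
    stand-in for the paper's per-step semantic uncertainty estimate."""
    counts = Counter(candidates)
    total = len(candidates)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def denoise_step(step_input, generate, verify, threshold=0.8, branches=4):
    """One Sensing -> Regulating -> Correcting iteration (hypothetical API).

    generate(x, n) -> list of n candidate outputs for input x
    verify(y)      -> True if candidate y passes the verifier
    """
    # Sensing: probe with a small sample to estimate ambiguity.
    probe = generate(step_input, 3)
    u = semantic_uncertainty(probe)

    # Regulating: cheap single-path reuse when confident,
    # wider parallel exploration when estimated risk is high.
    candidates = probe if u < threshold else generate(step_input, branches)

    # Correcting: keep verifier-approved candidates; fall back to
    # a majority vote over all candidates if none pass.
    passed = [y for y in candidates if verify(y)]
    pool = passed or candidates
    return Counter(pool).most_common(1)[0][0], u
```

In this sketch the 40–56% cost reduction would come from the regulating branch: low-uncertainty steps reuse the cheap probe instead of spawning parallel branches.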
Problem

Research questions and friction points this paper is trying to address.

semantic ambiguity
LLM agentic workflows
multi-step reasoning
error accumulation
reliability degradation
Innovation

Methods, ideas, or system contributions that make the work stand out.

DenoiseFlow
uncertainty-aware denoising
Noisy MDP
adaptive branching
semantic ambiguity
Yandong Yan
School of Computer Science, Peking University
Junwei Peng
School of Electronics Engineering and Computer Science, Peking University
Shijie Li
China Southern Power Grid
Chenxi Li
University of Glasgow, University of Electronic Science and Technology of China
Trustworthy LVLMs, AI for Science & Industry
Yifei Shang
School of Electronics Engineering and Computer Science, Peking University
Can Deng
Tsinghua University
Ruiting Dai
University of Electronic Science and Technology of China
Yongqiang Zhao
Key Laboratory of High Confidence Software Technologies (PKU), MOE
Jiaqi Zhu
Institute of Software, Chinese Academy of Sciences
Web Mining, Anomaly Detection, Knowledge Graph
Yu Huang
National Engineering Research Center for Software Engineering, Peking University