🤖 AI Summary
This work addresses the performance degradation of large language models on long-horizon information-seeking tasks, a phenomenon known as “context rot” in which an agent fails to maintain a coherent, task-relevant state as its interaction history grows. To mitigate it, the authors propose ARC, an active, reflection-driven context management framework that treats context as a dynamic reasoning state. The framework uses reflection-driven monitoring and revision to detect misalignment or degradation and to reorganize the working context during execution. Unlike raw accumulation or passive summarization, which treat context as a static artifact, it models context management as an ongoing, reflective process. Evaluated on long-horizon information-seeking benchmarks, the method achieves up to an 11% absolute improvement in accuracy over passive compression baselines on BrowseComp-ZH with Qwen2.5-32B-Instruct.
📝 Abstract
Large language models are increasingly deployed as research agents for deep search and long-horizon information seeking, yet their performance often degrades as interaction histories grow. This degradation, known as context rot, reflects a failure to maintain coherent and task-relevant internal states over extended reasoning horizons. Existing approaches primarily manage context through raw accumulation or passive summarization, treating it as a static artifact and allowing early errors or misplaced emphasis to persist. To address this limitation, we propose ARC, the first framework to systematically formulate context management as an active, reflection-driven process that treats context as a dynamic internal reasoning state during execution. ARC operationalizes this view through reflection-driven monitoring and revision, allowing agents to actively reorganize their working context when misalignment or degradation is detected. Experiments on challenging long-horizon information-seeking benchmarks show that ARC consistently outperforms passive context compression methods, achieving up to an 11% absolute improvement in accuracy on BrowseComp-ZH with Qwen2.5-32B-Instruct.
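To make the monitor-then-revise idea concrete, the following is a minimal illustrative sketch, not ARC's actual implementation, which the abstract does not specify. Every name here (`WorkingContext`, `keyword_relevance`, `reflect_and_revise`) and both thresholds are hypothetical, and the keyword-overlap scorer is only a toy stand-in for whatever model-based reflection a real agent would use.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorkingContext:
    """Hypothetical container for an agent's task and accumulated notes."""
    task: str
    notes: List[str] = field(default_factory=list)


def keyword_relevance(task: str, note: str) -> float:
    """Toy relevance score: fraction of task words that also appear in the note.

    Assumption: a real system would ask a model to judge relevance instead.
    """
    task_words = set(task.lower().split())
    note_words = set(note.lower().split())
    if not task_words:
        return 0.0
    return len(task_words & note_words) / len(task_words)


def reflect_and_revise(
    ctx: WorkingContext,
    score: Callable[[str, str], float] = keyword_relevance,
    trigger_threshold: float = 0.5,  # hypothetical: revise if mean relevance falls below this
    keep_threshold: float = 0.25,    # hypothetical: notes scoring below this are dropped
) -> WorkingContext:
    """Monitor the context; rebuild it only when it appears to have drifted."""
    if not ctx.notes:
        return ctx
    scores = [score(ctx.task, note) for note in ctx.notes]
    mean_score = sum(scores) / len(scores)
    if mean_score >= trigger_threshold:
        # Monitoring step: the context still looks aligned, so leave it untouched.
        return ctx
    # Revision step: reconstruct the working context from the relevant notes only.
    kept = [note for note, s in zip(ctx.notes, scores) if s >= keep_threshold]
    return WorkingContext(task=ctx.task, notes=kept)


if __name__ == "__main__":
    ctx = WorkingContext(
        task="founder of company",
        notes=[
            "the founder of the company was born in 1970",
            "weather forecast for tomorrow",
            "cookie banner accept all",
        ],
    )
    print(reflect_and_revise(ctx).notes)
```

The sketch separates the two roles the abstract names: a monitoring check that decides whether revision is needed at all, and a revision step that rebuilds the context rather than appending to or summarizing it. How ARC detects misalignment and what it rewrites are not described in the abstract, so those parts are assumptions.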