Contextual Drag: How Errors in the Context Affect LLM Reasoning

📅 2026-02-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work identifies and names contextual drag: when failed attempts remain in the context, large language models tend to reproduce structurally similar reasoning errors in subsequent generations, which degrades performance and, under iterative self-refinement, can collapse into self-deterioration. The study evaluates 11 proprietary and open-weight models on 8 reasoning tasks. Structural analysis with tree edit distance shows that later reasoning trajectories inherit error patterns from the context. Contextual drag reduces performance by 10-20%, and neither external feedback nor successful self-verification eliminates it. Mitigation strategies such as fallback-behavior fine-tuning and context denoising yield partial improvements but do not fully restore baseline performance, which marks contextual drag as a persistent failure mode in current reasoning architectures.

📝 Abstract

Central to many self-improvement pipelines for large language models (LLMs) is the assumption that models can improve by reflecting on past mistakes. We study a phenomenon termed contextual drag: the presence of failed attempts in the context biases subsequent generations toward structurally similar errors. Across evaluations of 11 proprietary and open-weight models on 8 reasoning tasks, contextual drag induces 10-20% performance drops, and iterative self-refinement in models with severe contextual drag can collapse into self-deterioration. Structural analysis using tree edit distance reveals that subsequent reasoning trajectories inherit structurally similar error patterns from the context. We demonstrate that neither external feedback nor successful self-verification suffices to eliminate this effect. While mitigation strategies such as fallback-behavior fine-tuning and context denoising yield partial improvements, they fail to fully restore baseline performance, positioning contextual drag as a persistent failure mode in current reasoning architectures.
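The structural analysis mentioned in the abstract relies on tree edit distance: the minimum number of node insertions, deletions, and relabelings needed to turn one ordered tree into another. The sketch below is a minimal illustration of that metric, not the authors' implementation. The representation of a reasoning trajectory as a nested `(label, children)` tuple, the unit edit costs, and the step labels (`solve`, `expand`, `sign_error`, and so on) are all assumptions made for the example; the abstract does not say how trajectories are parsed into trees.

```python
from functools import lru_cache

# Hypothetical representation: a tree is (label, children), where children is
# a tuple of trees. Nested tuples are hashable, so results can be memoized.

def tree_edit_distance(a, b):
    """Unit-cost edit distance between two ordered, labeled trees."""

    @lru_cache(maxsize=None)
    def forest_dist(f, g):
        # f and g are forests: tuples of trees, processed from the right.
        if not f and not g:
            return 0
        if not f:
            # Insert the rightmost root of g; its children take its place.
            _, kids = g[-1]
            return forest_dist(f, g[:-1] + kids) + 1
        if not g:
            # Delete the rightmost root of f; its children take its place.
            _, kids = f[-1]
            return forest_dist(f[:-1] + kids, g) + 1
        (f_label, f_kids), (g_label, g_kids) = f[-1], g[-1]
        delete = forest_dist(f[:-1] + f_kids, g) + 1
        insert = forest_dist(f, g[:-1] + g_kids) + 1
        # Map the two rightmost roots onto each other (relabel if they differ),
        # then solve their subtrees and the remaining forests independently.
        relabel = 0 if f_label == g_label else 1
        match = forest_dist(f_kids, g_kids) + forest_dist(f[:-1], g[:-1]) + relabel
        return min(delete, insert, match)

    return forest_dist((a,), (b,))


# Illustrative (made-up) trajectories: a failed attempt, a retry that saw the
# failed attempt in its context, and a fresh attempt with a clean context.
failed = ("solve", (("expand", ()), ("substitute", (("sign_error", ()),))))
retry = ("solve", (("expand", ()), ("substitute", (("sign_error", ()),)), ("check", ())))
fresh = ("solve", (("factor", ()), ("check", ())))

print(tree_edit_distance(failed, retry))  # 1: the retry only adds a "check" step
print(tree_edit_distance(failed, fresh))  # 3: the fresh attempt is structurally different
```

Under this toy setup, a small distance between the failed attempt and the retry is the kind of signal that would indicate the retry inherited the structure of the earlier error. The plain recursion is only suitable for small trees; larger trajectories would call for a dedicated algorithm such as Zhang-Shasha.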

Problem

Research questions and friction points this paper is trying to address.

contextual drag

LLM reasoning

error propagation

self-improvement

reasoning failure

Innovation

Methods, ideas, or system contributions that make the work stand out.

contextual drag

self-improvement

reasoning errors