Dense Contexts Are Hard Contexts: Lexical Density Limits Effective Context in LLMs

📅 2026-06-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study examines how high lexical density degrades key-information retrieval in large language models on long inputs, shrinking their effective context window. It identifies lexical density, the rate at which a context introduces distinct information, as a largely overlooked factor limiting effective context length. The authors use three needle-in-a-haystack style benchmarks with identical length (~12k tokens) and controlled needle position but increasing information density, and they also vary density within each benchmark to rule out task-type confounds. Experiments on open-weight models from 9B to 685B parameters show that models that are near-perfect in sparse contexts fall below a 60% retrieval score in denser ones, while reducing density generally restores performance. The findings position lexical density as a third dimension, alongside input length and information position, that constrains long-context performance.

📝 Abstract

Input length and the position of relevant information are widely cited as the primary causes of degraded LLM long-context performance. Here, we study lexical density -- the rate at which a context introduces distinct information -- as a third, largely overlooked factor that systematically reduces the effective context window of LLMs. We quantify the impact of lexical density on open-weight LLMs (9B-685B) using three "find-the-needle" style benchmarks with identical length (~12k tokens) and controlled needle position, but increasing density of information. We observe a sharp performance collapse in higher-density benchmarks: models that are near-perfect in sparse contexts drop below 60% retrieval score on denser ones. To rule out task-type confounds, we vary and control the density within each benchmark while keeping all other properties unchanged. Reducing density generally restores performance, especially in the high-density regimes where degradation appears. These results show that effective context capacity is a function of lexical density, with direct implications for real-world LLM systems operating on compact, information-rich inputs.
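The controlled setup described in the abstract (fixed length, fixed needle position, varying density) can be illustrated with a small sketch. This is not the authors' benchmark code. The filler sentences, the needle, the function names, and the use of type-token ratio as a crude stand-in for lexical density are all assumptions made for illustration, and length is counted in sentences rather than the ~12k tokens used in the paper.

```python
import random


def build_haystack(needle, fillers, total_sentences, needle_pos, n_distinct, seed=0):
    """Build a context with fixed size and a fixed needle index.

    n_distinct is the (hypothetical) density knob: the number of different
    filler sentences the context is allowed to draw from.
    """
    if not 1 <= n_distinct <= len(fillers):
        raise ValueError("n_distinct must be between 1 and len(fillers)")
    if not 0 <= needle_pos < total_sentences:
        raise ValueError("needle_pos must be a valid index into the context")
    rng = random.Random(seed)  # seeded so the context is reproducible
    pool = fillers[:n_distinct]
    body = [rng.choice(pool) for _ in range(total_sentences - 1)]
    body.insert(needle_pos, needle)  # needle position is identical across densities
    return body


def type_token_ratio(sentences):
    """Distinct words divided by total words: a rough proxy, assumed here,
    for how much distinct information the context introduces."""
    tokens = " ".join(sentences).lower().split()
    return len(set(tokens)) / len(tokens)


# Hypothetical filler facts; every sentence has the same word count, so the
# total length does not change with density.
FILLERS = [f"Item {i} is stored in bin {i * 7 % 100}." for i in range(50)]
NEEDLE = "The secret code is 4821."

sparse = build_haystack(NEEDLE, FILLERS, total_sentences=200, needle_pos=100, n_distinct=1)
dense = build_haystack(NEEDLE, FILLERS, total_sentences=200, needle_pos=100, n_distinct=50)

print(f"sparse TTR: {type_token_ratio(sparse):.4f}")
print(f"dense TTR:  {type_token_ratio(dense):.4f}")
```

Both contexts have the same number of sentences and words and hold the needle at the same index, so any difference in a model's retrieval score between them could be attributed to density rather than to length or position.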

Problem

Research questions and friction points this paper is trying to address.

lexical density

long-context performance

effective context window

large language models

information density

Innovation

Methods, ideas, or system contributions that make the work stand out.

lexical density

effective context window

long-context performance