Positional Fragility in LLMs: How Offset Effects Reshape Our Understanding of Memorization Risks

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies a "positional fragility" in large language models (LLMs): verbatim memorization is highly sensitive to the starting position of contextual sequences, and even minor positional shifts cause sharp degradation in recall, a phenomenon the authors term the "offset effect." The study pretrains 1B/3B/8B models from scratch on an 83B-token mixed corpus with controlled injection of public-domain books to simulate copyrighted material, then performs long-sequence probing (at lengths at least 10x those of prior work) and systematic offset-robustness analysis. The results establish the initial token of the context window as a critical retrieval anchor. By introducing "offset sensitivity" as a new dimension for quantifying memorization risk, the work challenges the conventional assumption of uniform prefix-based probing. It further finds that shifting sensitive content toward later positions simultaneously reduces extractability and mitigates text degeneration, yielding actionable strategies for data sanitization and compliant model deployment.

📝 Abstract
Large language models are known to memorize parts of their training data, posing a risk of copyright violation. To systematically examine this risk, we pretrain language models (1B/3B/8B) from scratch on 83B tokens, mixing web-scale data with public-domain books used to simulate copyrighted content at controlled frequencies, and probe at sequence lengths at least ten times longer than prior work. We thereby identify the offset effect, a phenomenon characterized by two key findings: (1) verbatim memorization is most strongly triggered by short prefixes drawn from the beginning of the context window, with memorization counterintuitively decreasing as prefix length increases; and (2) verbatim recall declines sharply when the prefix begins at an offset from the initial tokens of the context window. We attribute this to positional fragility: models rely disproportionately on the earliest tokens in their context window as retrieval anchors, making them sensitive to even slight shifts. We further observe that when the model fails to retrieve memorized content, it often produces degenerate text. Leveraging these findings, we show that shifting sensitive data deeper into the context window suppresses both extractable memorization and degeneration. Our results suggest that positional offset is a critical and previously overlooked axis for evaluating memorization risks, since prior work implicitly assumed uniformity by probing only from the beginning of training sequences.
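The offset-robustness probe described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: `verbatim_recall` takes a hypothetical `generate` callable (standing in for a real LM decoding loop), extracts a prefix starting at a chosen offset into a memorized document, and scores how much of the true continuation is reproduced verbatim. The `fragile_model` stand-in is purely synthetic and hard-codes the offset effect for demonstration.

```python
def verbatim_recall(generate, document, prefix_len=50, offset=0, target_len=50):
    """Probe verbatim memorization at a given positional offset.

    Take `prefix_len` tokens starting `offset` tokens into the document,
    ask the model to continue, and return the fraction of the true
    continuation that is reproduced exactly.
    """
    prefix = document[offset : offset + prefix_len]
    truth = document[offset + prefix_len : offset + prefix_len + target_len]
    completion = generate(prefix, max_new_tokens=len(truth))
    matched = sum(a == b for a, b in zip(completion, truth))
    return matched / max(len(truth), 1)


doc = list(range(200))  # toy "memorized" sequence of token ids

def fragile_model(prefix, max_new_tokens):
    # Synthetic stand-in: recalls the document only when the prefix is
    # anchored at position 0, mimicking positional fragility.
    if prefix == doc[: len(prefix)]:
        start = len(prefix)
        return doc[start : start + max_new_tokens]
    return [-1] * max_new_tokens  # degenerate output

print(verbatim_recall(fragile_model, doc, offset=0))  # 1.0
print(verbatim_recall(fragile_model, doc, offset=1))  # 0.0
```

Sweeping `offset` over a range and plotting `verbatim_recall` against it is one way to reproduce the sharp recall cliff the paper reports.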
Problem

Research questions and friction points this paper is trying to address.

Examines how positional offset affects LLM memorization risks
Identifies offset effect reducing verbatim recall with shifted prefixes
Proposes shifting sensitive data to suppress memorization and degeneration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pretrain models with mixed data to simulate copyright risks
Identify offset effect reducing memorization with shifted prefixes
Suppress memorization by shifting sensitive data deeper
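The mitigation in the last bullet, shifting sensitive data deeper into the context window, could be applied at training-data construction time roughly as below. This is a hedged sketch under assumed conventions: `shift_sensitive` and its parameters are illustrative names, and real pipelines would choose the filler text and offset policy more carefully.

```python
def shift_sensitive(sensitive_tokens, filler_tokens, min_offset, window):
    """Build a training window where sensitive content starts no earlier
    than `min_offset` tokens in, padding the front with benign filler.

    The idea (from the paper's findings) is that content placed away from
    the earliest context positions is markedly less extractable via
    prefix probing.
    """
    pad = filler_tokens[:min_offset]
    seq = pad + sensitive_tokens
    return seq[:window]  # truncate to the model's context window


filler = ["the"] * 100                      # stand-in benign tokens
sensitive = ["SECRET1", "SECRET2"]          # stand-in sensitive span
seq = shift_sensitive(sensitive, filler, min_offset=64, window=128)
print(seq.index("SECRET1"))  # 64: sensitive span starts at the offset
```

The trade-off is that filler consumes context budget, so `min_offset` must balance extraction risk against the amount of useful text per window.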