🤖 AI Summary
Large language models suffer markedly degraded information-retrieval accuracy over very long contexts (e.g., documents tens of thousands of tokens long). To address this, we propose an adaptive attention-head scaling mechanism that requires no model fine-tuning. We first empirically observe substantial heterogeneity across attention heads in their contribution to long-range retrieval, and we quantify each head's correlation with retrieval performance using zero-shot generated data. Building on this insight, we learn lightweight per-head scaling weights that amplify critical heads and suppress redundant ones at inference time, leaving the model's original weights untouched. The method generalizes well in-domain and remains robust out-of-domain: on LongBench document QA benchmarks it delivers consistent, substantial gains in retrieval accuracy in both settings. Moreover, it is fully compatible with mainstream training-free context extension techniques, effectively extending the usable context window while preserving output reliability.
📝 Abstract
In this work, we introduce a novel approach called Scaling to Emphasize Attention for Long-context retrieval (SEAL), which enhances the retrieval performance of large language models (LLMs) over extended contexts. Previous studies have shown that each attention head in LLMs has a unique functionality and collectively contributes to the overall behavior of the model. Similarly, we observe that specific heads are closely tied to long-context retrieval, showing positive or negative correlation with retrieval scores. Building on this insight, we propose a learning-based mechanism using zero-shot generated data to emphasize these heads, improving the model's performance in long-context retrieval tasks. By applying SEAL, we achieve significant improvements in in-domain retrieval performance, including document QA tasks from LongBench, and considerable improvements in out-of-domain cases. Additionally, when combined with existing training-free context extension techniques, SEAL extends the context limits of LLMs while maintaining highly reliable outputs, opening new avenues for research in this field.
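To make the core idea concrete, here is a minimal numpy sketch of per-head output scaling of the kind the abstract describes. All names and shapes are illustrative assumptions, not taken from the paper's code: each head's output is multiplied by a learned scalar before the heads are combined, so heads positively correlated with long-context retrieval can be emphasized and others suppressed.

```python
import numpy as np

def scale_head_outputs(head_outputs: np.ndarray, head_scales: np.ndarray) -> np.ndarray:
    """Apply a learned per-head scale factor (hypothetical SEAL-style emphasis).

    head_outputs: (num_heads, seq_len, head_dim) attention-head outputs
    head_scales:  (num_heads,) learned scalars; >1 amplifies, <1 suppresses
    """
    return head_outputs * head_scales[:, None, None]

rng = np.random.default_rng(0)
num_heads, seq_len, head_dim = 4, 8, 16
heads = rng.standard_normal((num_heads, seq_len, head_dim))

# Illustrative scales: emphasize head 0, suppress head 2, leave the rest unchanged.
scales = np.array([1.5, 1.0, 0.3, 1.0])
scaled = scale_head_outputs(heads, scales)

assert scaled.shape == heads.shape
assert np.allclose(scaled[1], heads[1])        # scale 1.0 leaves the head unchanged
assert np.allclose(scaled[0], 1.5 * heads[0])  # amplified head
```

Because only the small vector of per-head scales is learned (here from zero-shot generated data, per the abstract), the base model's weights stay frozen, which is what allows SEAL to compose with training-free context extension methods.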