SEAL: Scaling to Emphasize Attention for Long-Context Retrieval

📅 2025-01-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models suffer markedly degraded retrieval accuracy over very long contexts (e.g., documents of tens of thousands of tokens). To address this, we propose an adaptive per-head attention scaling mechanism that leaves the base model's weights untouched. We first observe substantial heterogeneity across attention heads in their contribution to long-range retrieval, and quantify each head's correlation with retrieval scores using zero-shot generated data. Building on this insight, we learn per-head scaling weights that amplify retrieval-critical heads and suppress detrimental ones at inference time; no fine-tuning of the original model weights is required, and the learned scales generalize well both in-domain and across domains. On LongBench document QA benchmarks the method delivers consistent gains in retrieval accuracy, in-domain and out-of-domain alike. Moreover, it is fully compatible with mainstream training-free context extension techniques, effectively extending the usable context window while preserving output reliability.
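The per-head scaling idea is simple enough to sketch. Below is a minimal, hypothetical PyTorch illustration of the mechanism the summary describes: an attention block augmented with one learnable scale per head, applied to each head's output before the output projection. The class and parameter names (`HeadScaledAttention`, `head_scale`) are ours for illustration, not the authors' implementation, and the procedure for fitting the scales on zero-shot generated retrieval data is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HeadScaledAttention(nn.Module):
    """Self-attention with one learnable scale per head (illustrative sketch)."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)
        self.out = nn.Linear(d_model, d_model, bias=False)
        # One scalar per head, initialized to 1.0 so the block starts as a
        # plain attention layer; these are the only trainable parameters.
        self.head_scale = nn.Parameter(torch.ones(n_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (batch, tokens, d_model) -> (batch, heads, tokens, d_head)
        q, k, v = (z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
                   for z in (q, k, v))
        o = F.scaled_dot_product_attention(q, k, v)  # causal mask omitted for brevity
        # Amplify or suppress each head's contribution before the output mix.
        o = o * self.head_scale.view(1, self.n_heads, 1, 1)
        return self.out(o.transpose(1, 2).reshape(b, t, -1))

# Usage: freeze the base weights and train only the per-head scales,
# so the adaptation cost is a single scalar per attention head.
attn = HeadScaledAttention(d_model=512, n_heads=8)
for name, p in attn.named_parameters():
    p.requires_grad = (name == "head_scale")
```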

📝 Abstract
In this work, we introduce a novel approach called Scaling to Emphasize Attention for Long-context retrieval (SEAL), which enhances the retrieval performance of large language models (LLMs) over extended contexts. Previous studies have shown that each attention head in LLMs has a unique functionality and collectively contributes to the overall behavior of the model. Similarly, we observe that specific heads are closely tied to long-context retrieval, showing positive or negative correlation with retrieval scores. Building on this insight, we propose a learning-based mechanism that uses zero-shot generated data to emphasize these heads, improving the model's performance in long-context retrieval tasks. By applying SEAL, we achieve significant improvements in in-domain retrieval performance, including document QA tasks from LongBench, and considerable improvements in out-of-domain cases. Additionally, when combined with existing training-free context extension techniques, SEAL extends the context limits of LLMs while maintaining highly reliable outputs, opening new avenues for research in this field.
Problem

Research questions and friction points this paper is trying to address.

Long Context
Information Retrieval
Large Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

SEAL method
Enhanced long-context retrieval
Improved accuracy without fine-tuning the base model
Authors
Changhun Lee, Department of Convergence IT Engineering, Pohang University of Science and Technology (POSTECH)
Jun-gyu Jin, POSTECH
Younghyun Cho, Graduate School of Artificial Intelligence, Pohang University of Science and Technology (POSTECH)
Eunhyeok Park, POSTECH. Research interests: neural network optimization, energy-efficient hardware design.