RAM-Net: Expressive Linear Attention with Selectively Addressable Memory

📅 2026-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes RAM-Net, a novel linear attention architecture that addresses the limited expressivity and information loss inherent in conventional linear attention models, which compress unbounded history into fixed-size memory. By mapping inputs to high-dimensional sparse vectors that serve as explicit memory addresses, RAM-Net enables selective access to an exponentially expanded state space without introducing any additional parameters. This approach maintains linear computational complexity while significantly reducing signal interference and enhancing retrieval fidelity. Empirical evaluations demonstrate that RAM-Net outperforms state-of-the-art methods on fine-grained long-range retrieval tasks and achieves competitive performance on language modeling and zero-shot commonsense reasoning benchmarks.

📝 Abstract
While linear attention architectures offer efficient inference, compressing unbounded history into a fixed-size memory inherently limits expressivity and causes information loss. To address this limitation, we introduce Random Access Memory Network (RAM-Net), a novel architecture designed to bridge the gap between the representational capacity of full attention and the memory efficiency of linear models. The core of RAM-Net maps inputs to high-dimensional sparse vectors serving as explicit addresses, allowing the model to selectively access a massive memory state. This design enables exponential state size scaling without additional parameters, which significantly mitigates signal interference and enhances retrieval fidelity. Moreover, the inherent sparsity ensures exceptional computational efficiency, as state updates are confined to minimal entries. Extensive experiments demonstrate that RAM-Net consistently surpasses state-of-the-art baselines in fine-grained long-range retrieval tasks and achieves competitive performance in standard language modeling and zero-shot commonsense reasoning benchmarks, validating its superior capability to capture complex dependencies with significantly reduced computational overhead.
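The abstract describes the core mechanism only at a high level: inputs are mapped to high-dimensional sparse vectors that act as explicit memory addresses, so state updates and reads touch only a few entries of a very large memory. The sketch below illustrates that idea under stated assumptions; the addressing function (a fixed random projection with top-k selection), the slot sizes, and the write/read rules are all hypothetical stand-ins, not the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_mem, k = 16, 1024, 4  # k active slots out of d_mem (illustrative sizes)

# Fixed random projection used as a parameter-free addressing map (an assumption;
# the paper only states that addresses are high-dimensional and sparse).
proj = rng.standard_normal((d_model, d_mem))

def sparse_address(x, k=k):
    """Map an input vector to a k-sparse binary address over d_mem slots."""
    scores = x @ proj
    addr = np.zeros(d_mem)
    addr[np.argsort(scores)[-k:]] = 1.0
    return addr

# Memory state: one value vector per addressable slot.
memory = np.zeros((d_mem, d_model))

def write(x_key, x_value):
    """Update only the k slots selected by the key's address (O(k) work per step)."""
    idx = np.flatnonzero(sparse_address(x_key))
    memory[idx] += x_value

def read(x_query):
    """Retrieve by averaging the slots the query addresses."""
    idx = np.flatnonzero(sparse_address(x_query))
    return memory[idx].mean(axis=0)

# Store a key-value pair, then retrieve it with the same key.
key = rng.standard_normal(d_model)
value = rng.standard_normal(d_model)
write(key, value)
out = read(key)  # same key -> same sparse address -> recovers the stored value
```

Because distinct keys tend to activate disjoint slot subsets in a large memory, writes interfere with each other far less than in a dense fixed-size state, which is the intuition behind the reduced signal interference the abstract claims.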
Problem

Research questions and friction points this paper is trying to address.

linear attention
memory compression
information loss
expressivity limitation
long-range dependency
Innovation

Methods, ideas, or system contributions that make the work stand out.

linear attention
selective memory access
sparse addressing
memory efficiency
long-range dependency
Kaicheng Xiao
The Chinese University of Hong Kong
Haotian Li
The Chinese University of Hong Kong
Liran Dong
The Chinese University of Hong Kong
Guoliang Xing
The Chinese University of Hong Kong
Embedded AI, AI for Health, Autonomous Driving, Cyber-Physical Systems, Wireless Networks