If Attention Serves as a Cognitive Model of Human Memory Retrieval, What is the Plausible Memory Representation?

📅 2025-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether attention mechanisms can serve as cognitively plausible models of human memory retrieval, specifically whether the syntax-guided attention of Transformer Grammar (TG) accounts for human sentence processing better than that of a standard sequence-based Transformer. Method: the authors propose the "syntax–token dual memory representation" hypothesis and quantify reading difficulty via Normalized Attention Entropy (NAE). TG is introduced into cognitive modeling for the first time, combined with self-paced reading data and layer-wise analyses of attention behavior. Contribution/Results: TG significantly outperforms the standard Transformer in predicting reading times. Crucially, the syntax- and token-level attention contributions are statistically independent, providing empirical support for the dual-representation hypothesis. By moving beyond purely sequential modeling, this work shows that syntactic structure plays a critical role in memory retrieval during human sentence processing, advancing computational cognitive science through theoretically grounded, architecture-informed modeling.
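
The "statistically independent contributions" claim is the kind of result typically established with nested regression comparisons on reading times. The sketch below illustrates that style of analysis only; it is not the paper's pipeline. The synthetic data, the column names (rt, nae_tg, nae_vanilla, length, freq), and the plain OLS setup are all assumptions, and psycholinguistic work often uses mixed-effects models instead.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for per-word self-paced reading data;
# all column names here are illustrative, not from the paper.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "length": rng.integers(1, 12, n),
    "freq": rng.normal(0, 1, n),
    "nae_tg": rng.uniform(0, 1, n),
    "nae_vanilla": rng.uniform(0, 1, n),
})
df["rt"] = (300 + 5 * df["length"] - 10 * df["freq"]
            + 40 * df["nae_tg"] + 25 * df["nae_vanilla"]
            + rng.normal(0, 30, n))

# Nested-model comparison: the log-likelihood gain from adding a
# predictor measures its unique contribution to reading times.
base      = smf.ols("rt ~ length + freq", data=df).fit()
with_tg   = smf.ols("rt ~ length + freq + nae_tg", data=df).fit()
with_both = smf.ols("rt ~ length + freq + nae_tg + nae_vanilla",
                    data=df).fit()

print("gain from TG NAE:        ", with_tg.llf - base.llf)
print("extra gain from vanilla: ", with_both.llf - with_tg.llf)
```

If the vanilla Transformer's NAE still improves fit on top of TG's NAE (and vice versa), the two predictors carry non-redundant information, which is the sense in which their contributions are "independent."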

📝 Abstract
Recent work in computational psycholinguistics has revealed intriguing parallels between attention mechanisms and human memory retrieval, focusing primarily on Transformer architectures that operate on token-level representations. However, computational psycholinguistic research has also established that syntactic structures provide compelling explanations for human sentence processing that word-level factors alone cannot fully account for. In this study, we investigate whether the attention mechanism of Transformer Grammar (TG), which uniquely operates on syntactic structures as representational units, can serve as a cognitive model of human memory retrieval, using Normalized Attention Entropy (NAE) as a linking hypothesis between model behavior and human processing difficulty. Our experiments demonstrate that TG's attention achieves superior predictive power for self-paced reading times compared to vanilla Transformer's, with further analyses revealing independent contributions from both models. These findings suggest that human sentence processing involves dual memory representations -- one based on syntactic structures and another on token sequences -- with attention serving as the general retrieval algorithm, while highlighting the importance of incorporating syntactic structures as representational units.
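
The abstract does not spell out how NAE is computed. As a minimal sketch: attention entropy is the Shannon entropy of a token's attention distribution over preceding positions, normalized by its maximum possible value (uniform attention), so it falls in [0, 1]. The function name and the (ignored) aggregation across heads and layers below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def normalized_attention_entropy(attn_row, eps=1e-12):
    """Shannon entropy of one attention distribution, normalized by
    the entropy of uniform attention over the same number of
    positions. Returns a value in [0, 1]."""
    attn_row = np.asarray(attn_row, dtype=float)
    attn_row = attn_row / attn_row.sum()  # ensure a proper distribution
    n = attn_row.size
    if n < 2:
        return 0.0  # a single attendable position carries no entropy
    entropy = -np.sum(attn_row * np.log(attn_row + eps))
    return float(entropy / np.log(n))  # divide by max entropy log(n)

# Peaked attention yields low NAE; uniform attention yields NAE = 1.
print(normalized_attention_entropy([0.7, 0.1, 0.1, 0.1]))      # ~0.68
print(normalized_attention_entropy([0.25, 0.25, 0.25, 0.25]))  # ~1.0
```

Under the retrieval reading of attention, diffuse (high-entropy) attention suggests a harder, less targeted memory retrieval, which is why NAE is a natural linking function to per-word reading times.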
Problem

Research questions and friction points this paper is trying to address.

Explores whether attention can serve as a model of human memory retrieval
Investigates the cognitive plausibility of Transformer Grammar
Assesses syntactic vs. token-based memory representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer Grammar attention mechanism
Normalized Attention Entropy analysis
Syntactic structures as representational units