🤖 AI Summary
This study investigates the temporal alignment between large language models (LLMs) and human brain dynamics during language processing: specifically, whether shallow LLM layers correspond to early neural responses and deep layers to late responses, and whether this alignment is jointly modulated by model scale and context length. Using high-temporal-resolution EEG recorded during an auditory story-listening paradigm, we systematically evaluate 22 LLMs spanning diverse architectures (Transformer and recurrent) and parameter counts, combining neural dynamical modeling with cross-modal representational alignment analysis. Our key finding is the first demonstration of a robust temporal correspondence between LLM layer depth and cortical response latency: shallow layers align with early neural activity (~100–300 ms post-stimulus), while deeper layers align with later activity (~400–800 ms). Critically, alignment strength is jointly determined by model parameter count and context window length, and this pattern holds consistently across architectures. These results point to convergent computational principles underlying sequential language processing in artificial and biological neural networks.
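In practice, this kind of layer-by-latency comparison is typically implemented with encoding models: for every (layer, latency) pair, a linear map is fit from the LLM's word-level activations to the brain response at that latency, and the cross-validated prediction accuracy serves as the alignment score. The sketch below illustrates that idea only; it is not the paper's code, and the function name `layer_latency_alignment`, the ridge-based estimator, and the input shapes are all assumptions.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict


def layer_latency_alignment(layer_acts, eeg, lags):
    """Hypothetical sketch: score each (LLM layer, brain latency) pair.

    layer_acts : list of (n_words, n_features) arrays, one per layer,
                 with activations aligned to word onsets.
    eeg        : (n_words, n_lags, n_channels) epoched brain responses,
                 one time slice per latency in `lags`.
    lags       : latencies (in seconds) relative to word onset.
    Returns a (n_layers, n_lags) matrix of cross-validated alignment scores.
    """
    scores = np.zeros((len(layer_acts), len(lags)))
    for i, X in enumerate(layer_acts):
        for j in range(len(lags)):
            Y = eeg[:, j, :]  # brain response at this latency, all channels
            # Ridge regression is an assumed (though standard) estimator here
            model = RidgeCV(alphas=np.logspace(-2, 4, 7))
            pred = cross_val_predict(model, X, Y, cv=5)
            # Alignment = prediction-truth correlation, averaged over channels
            r = [np.corrcoef(pred[:, c], Y[:, c])[0, 1]
                 for c in range(Y.shape[1])]
            scores[i, j] = np.mean(r)
    return scores
```

If shallow layers track early responses and deep layers track late ones, a heatmap of this matrix would show each row's peak shifting toward longer latencies as layer depth increases.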
📝 Abstract
Recent studies suggest that the representations learned by large language models (LLMs) are partially aligned with those of the human brain. However, whether and why this alignment arises from a similar sequence of computations remains elusive. In this study, we explore this question by examining temporally resolved brain signals from participants listening to 10 hours of an audiobook. We study these neural dynamics jointly with a benchmark encompassing 22 LLMs varying in size and architecture. Our analyses confirm that LLMs and the brain generate representations in a similar order: activations in the initial layers of an LLM tend to align best with early brain responses, while its deeper layers tend to align best with later brain responses. This brain-LLM alignment is consistent across transformer and recurrent architectures; however, its emergence depends on both model size and context length. Overall, this study sheds light on the sequential nature of these computations and on the factors underlying the partial convergence between biological and artificial neural networks.
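Given such a layer-by-latency score matrix, the central claim (initial layers align with early responses, deeper layers with later ones) amounts to each layer's best-aligned latency increasing monotonically with depth. Below is one minimal, hypothetical way to quantify that trend, assuming the `scores` matrix produced by the earlier sketch; `peak_latency_trend` is an illustrative helper, not the paper's analysis code.

```python
import numpy as np
from scipy.stats import spearmanr


def peak_latency_trend(scores, lags):
    """Hypothetical helper: test whether each layer's best-aligned latency
    increases with depth, given a (n_layers, n_lags) alignment matrix."""
    peak_lags = np.asarray(lags)[scores.argmax(axis=1)]  # best latency per layer
    depth = np.arange(len(peak_lags))
    rho, p_value = spearmanr(depth, peak_lags)  # monotonic-trend statistic
    return peak_lags, rho, p_value
```

A positive Spearman rho across layers would reflect the ordered correspondence reported here, and comparing rho across model sizes and context lengths would be one way to probe when the ordering emerges.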