🤖 AI Summary
This work proposes the first general-purpose EEG-to-text foundation model capable of generating clinically meaningful and interpretable narratives from electroencephalography (EEG) signals. Addressing the limitations of existing methods—which are often confined to specific tasks or coarse-grained recognition—the study introduces NeuroCorpus-160K, a standardized EEG–text paired corpus comprising 160,000 samples. The model integrates spectral-spatial contrastive learning, state-space temporal modeling, and conditional generation with large language models to jointly encode EEG topographic maps and time-series dynamics. Experimental results demonstrate that the framework effectively captures spatiotemporal–spectral dynamics, significantly improving the accuracy and interpretability of clinical narrative generation across multiple benchmarks and zero-shot transfer tasks.
📝 Abstract
Electroencephalography (EEG) provides a non-invasive window into neural dynamics at high temporal resolution and plays a pivotal role in clinical neuroscience research. Despite this potential, prevailing computational approaches to EEG analysis remain largely confined to task-specific classification objectives or coarse-grained pattern recognition, offering limited support for clinically meaningful interpretation. To address these limitations, we introduce NeuroNarrator, the first generalist EEG-to-text foundation model designed to translate electrophysiological segments into precise clinical narratives. A cornerstone of this framework is the curation of NeuroCorpus-160K, the first harmonized large-scale resource pairing over 160,000 EEG segments with structured, clinically grounded natural-language descriptions. Our architecture first aligns temporal EEG waveforms with spatial topographic maps via a rigorous contrastive objective, establishing spectro–spatially grounded representations. Building on this grounding, we condition a Large Language Model through a state-space–inspired formulation that integrates historical temporal and spectral context to support coherent clinical narrative generation. This approach establishes a principled bridge between continuous signal dynamics and discrete clinical language, enabling interpretable narrative generation that facilitates expert interpretation and supports clinical reporting workflows. Extensive evaluations across diverse benchmarks and zero-shot transfer tasks highlight NeuroNarrator's capacity to integrate temporal, spectral, and spatial dynamics, positioning it as a foundational framework for time–frequency–aware, open-ended clinical interpretation of electrophysiological data.
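The contrastive alignment step described above can be illustrated with a minimal sketch. The abstract does not specify the exact objective, so the code below assumes a standard symmetric InfoNCE loss between a batch of paired embeddings, one from a hypothetical waveform encoder and one from a hypothetical topographic-map encoder; the encoder outputs are simulated with toy vectors, and all names (`infonce_loss`, `z_wave`, `z_topo`) are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x, axis=-1):
    """Project embeddings onto the unit sphere so dot products are cosines."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def infonce_loss(z_wave, z_topo, temperature=0.07):
    """Symmetric InfoNCE: matched (waveform, topo-map) pairs are positives,
    all other pairs in the batch serve as negatives."""
    z_wave = l2_normalize(z_wave)
    z_topo = l2_normalize(z_topo)
    logits = z_wave @ z_topo.T / temperature  # (B, B) cosine-similarity matrix
    labels = np.arange(len(logits))           # positives lie on the diagonal

    def xent(lg):
        # numerically stable cross-entropy with diagonal targets
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average of wave→topo and topo→wave directions
    return 0.5 * (xent(logits) + xent(logits.T))

# Toy batch: 8 paired embeddings of dimension 32, standing in for the
# outputs of the two modality encoders; pairs are deliberately correlated.
B, D = 8, 32
z_wave = rng.normal(size=(B, D))
z_topo = z_wave + 0.1 * rng.normal(size=(B, D))
loss = infonce_loss(z_wave, z_topo)
```

Under this objective, correctly paired embeddings yield a lower loss than mismatched ones, which is the property that drives the two encoders toward a shared spectro–spatial space.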