Neuroprobe: Evaluating Intracranial Brain Responses to Naturalistic Stimuli

📅 2025-09-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
A standardized evaluation framework for intracranial electroencephalography (iEEG)-based language decoding is currently lacking, hindering fair model comparison and mechanistic interpretation of neural language processing. To address this, we introduce Neuroprobe, the first multimodal benchmark suite for iEEG language-processing decoding, built on the high-spatiotemporal-resolution BrainTreebank dataset. It incorporates naturalistic movie stimuli, time-resolved feature decoding, and systematic comparison of linear and deep learning models. This dedicated iEEG evaluation framework reveals the dynamic propagation of linguistic information from auditory to prefrontal cortices. Empirically, simple linear baselines outperform state-of-the-art neural foundation models across multiple decoding tasks. We publicly release the codebase and host a live leaderboard to foster reproducible, iEEG-driven research in neural language computation.

📝 Abstract
High-resolution neural datasets enable foundation models for the next generation of brain-computer interfaces and neurological treatments. The community requires rigorous benchmarks to discriminate between competing modeling approaches, yet no standardized evaluation frameworks exist for intracranial EEG (iEEG) recordings. To address this gap, we present Neuroprobe: a suite of decoding tasks for studying multi-modal language processing in the brain. Unlike scalp EEG, intracranial EEG requires invasive surgery to implant electrodes that record neural activity directly from the brain with minimal signal distortion. Neuroprobe is built on the BrainTreebank dataset, which consists of 40 hours of iEEG recordings from 10 human subjects performing a naturalistic movie viewing task. Neuroprobe serves two critical functions. First, it is a mine from which neuroscience insights can be drawn. Its high temporal and spatial resolution allows researchers to systematically determine when and where computations for each aspect of language processing occur in the brain by measuring the decodability of each feature across time and all electrode locations. Using Neuroprobe, we visualize how information flows from the superior temporal gyrus to the prefrontal cortex, and the progression from simple auditory features to more complex language features in a purely data-driven manner. Second, as the field moves toward neural foundation models, Neuroprobe provides a rigorous framework for comparing competing architectures and training protocols. We find that the linear baseline is surprisingly strong, beating frontier foundation models on many tasks. Neuroprobe is designed with computational efficiency and ease of use in mind. We make the code for Neuroprobe openly available and maintain a public leaderboard at https://neuroprobe.dev/, aiming to enable rapid progress in the field of iEEG foundation models.
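The time-resolved decoding idea in the abstract, measuring the decodability of a feature at each time window with a simple linear model, can be sketched as follows. This is an illustrative sketch on synthetic data, not Neuroprobe's actual API: the array shapes, the binary target, and the injected late-window signal are all hypothetical.

```python
# Sketch of time-resolved linear decoding (hypothetical data, not Neuroprobe's API):
# fit a linear classifier on iEEG features within each time window and record
# cross-validated accuracy, so decodability can be traced over time.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic recording: trials x electrodes x timepoints.
n_trials, n_electrodes, n_timepoints = 200, 16, 50
X = rng.standard_normal((n_trials, n_electrodes, n_timepoints))
y = rng.integers(0, 2, n_trials)   # e.g. a binary linguistic feature per trial
X[y == 1, :, 25:] += 0.5           # inject class signal only in late windows

window = 10
scores = []
for start in range(0, n_timepoints - window + 1, window):
    # Flatten the (electrodes x window) slice into one feature vector per trial.
    feats = X[:, :, start:start + window].reshape(n_trials, -1)
    clf = LogisticRegression(max_iter=1000)
    scores.append(cross_val_score(clf, feats, y, cv=5).mean())

# Early windows should sit near chance; windows after t=25 should decode well.
print([round(s, 2) for s in scores])
```

Repeating this per electrode (rather than over all electrodes at once) would give the "when and where" map the abstract describes; a strong linear baseline like this is also the reference point against which the foundation models are compared.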
Problem

Research questions and friction points this paper is trying to address.

Lack of standardized evaluation frameworks for intracranial EEG recordings
Need to decode multi-modal language processing in the brain
Requirement for benchmarks to compare competing neural foundation models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoding tasks for multi-modal language processing
Framework comparing neural foundation model architectures
Public leaderboard for iEEG model evaluation