PRISM: Exploring Heterogeneous Pretrained EEG Foundation Model Transfer to Clinical Differential Diagnosis

📅 2026-02-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the limited generalizability of existing EEG foundation models, which are typically pretrained on single-source clinical data and struggle to disentangle neurophysiological signals from device- or site-specific artifacts. To overcome this, the authors construct a multi-center EEG pretraining corpus with deliberate geographic and device diversity and employ a unified masked autoencoder (MAE) architecture with consistent preprocessing. They systematically investigate how data heterogeneity influences transfer performance on downstream clinical tasks, revealing a critical trade-off between data diversity and model transferability. Their findings demonstrate that strategically curated diverse data outperforms indiscriminate scale expansion, achieving a 12.3-percentage-point gain in balanced accuracy on an unseen seizure-versus-mimics discrimination task—matching or surpassing the REVE model trained on 92 datasets—and identify six key non-additive bias factors that critically impact evaluation.
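The masked-autoencoder pretraining the summary refers to can be sketched in miniature. This is a toy illustration only, not the paper's architecture: patch lengths, the mask ratio, and the trivial mean-based "decoder" are all stand-in assumptions (a real MAE reconstructs masked patches with a learned transformer decoder).

```python
import numpy as np

rng = np.random.default_rng(0)

def mae_recon_loss(segment, patch_len=50, mask_ratio=0.5):
    """Toy MAE objective on one EEG segment of shape (channels, time)."""
    # Split the time axis into non-overlapping patches.
    c, t = segment.shape
    n_patches = t // patch_len
    patches = segment[:, :n_patches * patch_len].reshape(c, n_patches, patch_len)
    # Randomly mask a fraction of the patches.
    n_masked = max(1, int(round(mask_ratio * n_patches)))
    masked_idx = rng.choice(n_patches, size=n_masked, replace=False)
    visible = np.delete(patches, masked_idx, axis=1)
    # Stand-in "decoder": predict every masked patch as the per-channel
    # mean of the visible patches (a real MAE learns this mapping).
    pred = visible.mean(axis=(1, 2), keepdims=True)   # (c, 1, 1), broadcasts below
    target = patches[:, masked_idx, :]                # (c, n_masked, patch_len)
    # Reconstruction MSE is scored on the masked patches only.
    return float(np.mean((target - pred) ** 2))

segment = rng.standard_normal((19, 1000))  # e.g. 19-channel, 1000-sample segment
loss = mae_recon_loss(segment)
```

The point of the ablation design described above is that this objective, preprocessing, and architecture stay fixed while only the pretraining population varies.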

📝 Abstract
EEG foundation models are typically pretrained on narrow-source clinical archives and evaluated on benchmarks from the same ecosystem, leaving unclear whether representations encode neural physiology or recording-distribution artifacts. We introduce PRISM (Population Representative Invariant Signal Model), a masked autoencoder ablated along two axes -- pretraining population and downstream adaptation -- with architecture and preprocessing fixed. We compare a narrow-source EU/US corpus (TUH + PhysioNet) against a geographically diverse pool augmented with multi-center South Asian clinical recordings across multiple EEG systems. Three findings emerge. First, narrow-source pretraining yields stronger linear probes on distribution-matched benchmarks, while diverse pretraining produces more adaptable representations under fine-tuning -- a trade-off invisible under single-protocol evaluation. Trained on three source corpora, PRISM matches or outperforms REVE (92 datasets, 60,000+ hours) on the majority of tasks, demonstrating that targeted diversity can substitute for indiscriminate scale and that dataset count is a confounding variable in model comparison. Second, on a clinically challenging and previously untested task -- distinguishing epilepsy from diagnostic mimickers via interictal EEG -- the diverse checkpoint outperforms the narrow-source checkpoint by +12.3 pp balanced accuracy, the largest gap across all evaluations. Third, systematic inconsistencies between EEG-Bench and EEG-FM-Bench reverse model rankings on identical datasets by up to 24 pp; we identify six concrete sources including split construction, checkpoint selection, segment length, and normalization, showing these factors compound non-additively.
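The abstract's headline gap is reported in balanced accuracy, which is the mean of per-class recalls rather than raw accuracy. A minimal sketch (toy labels, not the paper's data) shows why this matters for an imbalanced task like seizure-vs-mimics:

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    # Mean of per-class recalls, so the majority class cannot dominate the score.
    classes = np.unique(y_true)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(recalls))

# Hypothetical split: 6 mimics (label 0), 2 seizures (label 1).
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1])
# Plain accuracy is 7/8 = 0.875, but balanced accuracy averages the
# per-class recalls: (6/6 + 1/2) / 2 = 0.75.
bacc = balanced_accuracy(y_true, y_pred)
```

On such a task, a +12.3 pp difference in this metric reflects genuinely better minority-class recall, not just majority-class bias.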
Problem

Research questions and friction points this paper addresses.

EEG foundation model
clinical differential diagnosis
distribution shift
benchmark inconsistency
epilepsy mimickers
Innovation

Methods, ideas, or system contributions that make the work stand out.

EEG foundation model
pretraining diversity
clinical transfer learning
benchmark inconsistency
masked autoencoder
Jeet Bandhu Lahiri
Indian Institute of Technology Mandi, India
Parshva Runwal
NeuroDx, India
Arvasu Kulkarni
NeuroDx, India
Mahir Jain
NeuroDx, India
Aditya Ray Mishra
NeuroDx, India
Siddharth Panwar
Indian Institute of Technology Mandi, India; NeuroDx, India
Sandeep Singh
Assistant Professor, IIT Roorkee, India
Communication and Optical Networks, Quantum optics, Datacenter, Machine Learning, Stochastic