🤖 AI Summary
Current foundation models for EEG and intracranial EEG (iEEG) are constrained by limited scale, hindering performance gains. To address this, we construct the largest and most diverse electrophysiological dataset to date (59.3k hours of recordings from 17.7k subjects) and establish the first data-constrained scaling law for electrophysiology. Methodologically, we propose three key innovations: (1) any-variate attention to flexibly handle heterogeneous channel counts; (2) sliding temporal conditional positional encoding to capture dynamic temporal dependencies; and (3) multi-domain reconstruction for joint learning across signal, spectral, and time-frequency domains. Leveraging these, we train DIVER-1, a Transformer-based model, via large-scale distributed self-supervised pretraining. Evaluated on standard iEEG and EEG benchmarks, DIVER-1 achieves state-of-the-art performance. Furthermore, our ablation and scaling studies systematically uncover efficient scaling trajectories and principled resource-allocation guidelines, providing both theoretical foundations and engineering blueprints for next-generation neural electrophysiology foundation models.
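The summary does not detail how any-variate attention works; below is a minimal PyTorch sketch of one plausible design, assuming a MOIRAI-style mechanism in which tokens from all channels are flattened into a single sequence and a learned per-head binary bias tells attention whether a query/key pair comes from the same channel. All names (`AnyVariateAttention`, `channel_ids`) are illustrative and not taken from the DIVER-1 codebase.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnyVariateAttention(nn.Module):
    """Hypothetical any-variate attention: one flat token sequence over all
    channels, with a learned same-channel / cross-channel attention bias."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # One scalar bias per head for same-channel pairs, one for cross-channel.
        self.same_bias = nn.Parameter(torch.zeros(num_heads))
        self.diff_bias = nn.Parameter(torch.zeros(num_heads))

    def forward(self, x: torch.Tensor, channel_ids: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim) tokens from all channels flattened together;
        # channel_ids: (batch, seq) index of the channel each token came from.
        B, S, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(B, S, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(B, S, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(B, S, self.num_heads, self.head_dim).transpose(1, 2)

        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5  # (B, H, S, S)
        same = channel_ids[:, :, None] == channel_ids[:, None, :]  # (B, S, S)
        bias = torch.where(
            same[:, None],                      # broadcast over heads
            self.same_bias.view(1, -1, 1, 1),
            self.diff_bias.view(1, -1, 1, 1),
        )
        out = F.softmax(scores + bias, dim=-1) @ v  # (B, H, S, head_dim)
        return self.proj(out.transpose(1, 2).reshape(B, S, D))

# Example: 2 recordings, 4 channels x 16 time patches = 64 tokens each.
layer = AnyVariateAttention(dim=64, num_heads=8)
x = torch.randn(2, 64, 64)
channel_ids = torch.arange(4).repeat_interleave(16).expand(2, -1)
out = layer(x, channel_ids)  # (2, 64, 64)
```

Because the bias depends only on whether two tokens share a channel, not on which channel or how many there are, the layer accepts arbitrary channel counts and is invariant to channel ordering, which is what heterogeneous EEG/iEEG montages require.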
📝 Abstract
Electrophysiology signals such as EEG and iEEG are central to neuroscience, brain-computer interfaces, and clinical applications, yet existing foundation models remain limited in scale despite clear evidence that scaling improves performance. We introduce DIVER-1, a family of EEG and iEEG foundation models trained on the largest and most diverse corpus to date (5.3k hours of iEEG and 54k hours of EEG, totaling 1.6M channel-hours from over 17.7k subjects) and scaled up to 1.82B parameters. We present the first systematic scaling law analysis for this domain, showing that these models follow data-constrained scaling laws: for a given amount of data and compute, smaller models trained for extended epochs consistently outperform larger models trained briefly. This behavior contrasts with prior electrophysiology foundation models, which emphasized model size over training duration. To achieve strong performance, we also design architectural innovations including any-variate attention, sliding temporal conditional positional encoding, and multi-domain reconstruction. The DIVER-1 iEEG and EEG models each achieve state-of-the-art performance on their respective benchmarks, establishing concrete guidelines for efficient scaling and resource allocation in electrophysiology foundation model development.
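The abstract does not state the fitted functional form of the scaling law. For intuition, here is a minimal sketch of one common parameterization, the data-constrained scaling law of Muennighoff et al. (2023) for language models, in which repeated epochs contribute a decaying "effective" data count; the form actually fitted for DIVER-1 may differ.

```latex
% Illustrative data-constrained scaling law (Muennighoff et al., 2023);
% the exact parameterization fitted in the DIVER-1 paper may differ.
L(N, D) = E + \frac{A}{N'^{\alpha}} + \frac{B}{D'^{\beta}},
\qquad
D' = U_D + U_D R_D^{*} \left( 1 - e^{-R_D / R_D^{*}} \right)
```

Here U_D is the amount of unique data, R_D = D / U_D - 1 counts repeated epochs beyond the first, and R_D^* is a fitted decay constant: each additional pass over the same data still lowers the loss, but with exponentially diminishing returns. Under such a law, at fixed compute, a smaller model trained for more epochs can reach a lower loss than a larger model trained briefly, which is the trade-off the abstract describes.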