OmniEEG-Bench: A Standardized Evaluation Benchmark for EEG Foundation Models

📅 2026-05-30

🤖 AI Summary
This work addresses the lack of comparability among existing EEG foundation models, which stems from heterogeneous datasets and inconsistent task protocols. To this end, we introduce OmniEEG-Bench, a unified evaluation benchmark organized into six task families: signal reliability, biometrics and disease, consciousness and state, cognition and emotion, naturalistic stimulus decoding, and motor and interaction. The benchmark integrates 54 datasets and standardizes evaluation through uniform task cards and preprocessing pipelines. A systematic assessment of ten representative models shows that both pretraining data diversity and model size are significantly associated with better average ranks across datasets. The analysis reveals scaling-law behavior in EEG foundation models and suggests that scaling them requires broader, more diverse pretraining data as well as larger architectures.
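As a rough illustration of what a task card could look like, the sketch below models one as a small validated record. The six task family names come from the abstract; everything else (the `TaskCard` class, its field names, the example dataset and metric strings) is a hypothetical assumption for illustration and is not the schema used in the OmniEEG-Bench repository.

```python
from dataclasses import dataclass

# The six task families named in the abstract.
TASK_FAMILIES = (
    "signal reliability",
    "biometrics and disease",
    "consciousness and state",
    "cognition and emotion",
    "naturalistic stimulus decoding",
    "motor and interaction",
)


@dataclass(frozen=True)
class TaskCard:
    """Hypothetical task card: field names are assumptions, not the real schema."""

    name: str          # human-readable task name
    task_family: str   # must be one of TASK_FAMILIES
    dataset: str       # identifier of the dataset the task runs on
    metric: str        # name of the evaluation metric to report

    def __post_init__(self):
        # Reject cards that do not belong to one of the six families.
        if self.task_family not in TASK_FAMILIES:
            raise ValueError(f"unknown task family: {self.task_family!r}")


# Example card with made-up dataset and metric names.
example_card = TaskCard(
    name="emotion recognition",
    task_family="cognition and emotion",
    dataset="example_emotion_dataset",
    metric="balanced_accuracy",
)
```

Declaring the family, dataset, and metric in one record is one way to keep every model evaluated under the same definition of a task, which is the comparability problem the benchmark targets.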
📝 Abstract
Electroencephalography (EEG) supports a variety of brain-computer interface (BCI) tasks ranging from brain-state monitoring to human-LLM interactions. EEG foundation models are emerging, but evaluation remains fragmented due to heterogeneous datasets and inconsistent task protocols. Here, we introduce OmniEEG-Bench, a unified benchmark and downstream task roadmap for EEG foundation models (FMs). It organizes evaluation of EEG FMs into six task families spanning (i) signal reliability, (ii) biometrics and disease, (iii) consciousness and state, (iv) cognition and emotion, (v) naturalistic stimulus decoding, and (vi) motor and interaction, introducing a new generation of tasks not systematically benchmarked in prior EEG FM work. OmniEEG-Bench standardizes model deployment, task definitions, and metrics through a task-card specification, and unifies 54 EEG datasets with consistent evaluation protocols. We benchmark 10 representative EEG foundation models and report a leaderboard that covers diverse evaluation settings. Both pretraining dataset diversity and model size are significantly associated with better average ranks across datasets, revealing scaling-law behavior in EEG foundation models (Figure 1). These results suggest that scaling EEG foundation models requires not only larger architectures but also broader and more diverse pretraining data. The benchmark code is available at https://github.com/ncclab-sustech/omni-eegbench.git.
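The "average rank across datasets" used to compare models can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: it assumes a higher score is better on every dataset, assigns tied models the mean of their tied positions, and uses made-up model names, dataset names, and scores.

```python
from statistics import mean


def rank_models(scores):
    """Rank models on one dataset (1 = best); ties share the mean of their positions."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j over the run of models tied with the model at position i.
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = (i + 1 + j + 1) / 2  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[ordered[k]] = shared
        i = j + 1
    return ranks


def average_ranks(per_dataset_scores):
    """Average each model's per-dataset rank over the datasets it appears in."""
    collected = {}
    for scores in per_dataset_scores.values():
        for model, rank in rank_models(scores).items():
            collected.setdefault(model, []).append(rank)
    return {model: mean(ranks) for model, ranks in collected.items()}


# Hypothetical scores for three models on three datasets (not real results).
toy_scores = {
    "dataset_1": {"model_a": 0.80, "model_b": 0.70, "model_c": 0.60},
    "dataset_2": {"model_a": 0.65, "model_b": 0.70, "model_c": 0.50},
    "dataset_3": {"model_a": 0.90, "model_b": 0.90, "model_c": 0.40},
}
toy_average = average_ranks(toy_scores)
```

Ranking within each dataset before averaging puts tasks with different metrics and score ranges on a common scale, so a lower average rank indicates a model that is more consistently near the top.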
Problem

Research questions and friction points this paper is trying to address.

EEG foundation models
evaluation benchmark
standardized evaluation
heterogeneous datasets
task protocols
Innovation

Methods, ideas, or system contributions that make the work stand out.

EEG foundation models
standardized benchmark
scaling law
task-card specification
unified evaluation
Authors
Ziling Lu
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China
Zongsheng Li
School of Computer Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China; Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China
Xinke Shen
Southern University of Science and Technology
Affective Brain Computer Interface
Kexin Lou
School of Electrical Engineering and Computer Science, University of Queensland
Yingyue Xin
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China
Xiaoqi Chen
Purdue University
Software Defined Networking, Programmable Data Plane
Shinan Wang
Wayne State University
Computer Systems, Energy-Efficient Computing, Sensor Networks, Application Power Management
Xiang Chen
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China
Jiahao Fan
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China
Chenyu Huang
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China
Xin Xu
Professor of Wuhan University of Science and Technology
Person re-identification, Low-light image processing, Salient object detection
Zhoujie Hou
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China
Chen Wei
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China; Omni-Intelligence, Shenzhen, China
Quanying Liu
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China; Omni-Intelligence, Shenzhen, China; Shenzhen Loop Area Institute, Shenzhen, China