LEAD: Large Foundation Model for EEG-Based Alzheimer's Disease Detection

📅 2025-02-02
🤖 AI Summary
To address two challenges in Alzheimer's disease (AD) detection from electroencephalography (EEG), namely the scarcity of large-scale datasets and substantial inter-subject variability, this work curates an EEG-AD corpus of 813 subjects, which the authors believe to be the largest to date, and proposes LEAD, the first large foundation model for EEG-based AD detection. The model is pre-trained on 11 EEG datasets with self-supervised contrastive learning at two granularities (sample-level and subject-level), uses a backbone encoder with temporal and channel embeddings to capture both temporal and spatial features, and is fine-tuned jointly on 5 channel-aligned AD datasets. Evaluation is subject-independent, and subject-level predictions are obtained by majority voting. LEAD improves F1 score by up to 9.86% at the sample level and up to 9.31% at the subject level over state-of-the-art methods, which supports the effectiveness of contrastive pre-training and channel-aligned unified fine-tuning in handling inter-subject variation.

📝 Abstract
Electroencephalogram (EEG) provides a non-invasive, highly accessible, and cost-effective solution for Alzheimer's Disease (AD) detection. However, existing methods, whether based on manual feature extraction or deep learning, face two major challenges: the lack of large-scale datasets for robust feature learning and evaluation, and poor detection performance due to inter-subject variations. To address these challenges, we curate an EEG-AD corpus containing 813 subjects, which forms the world's largest EEG-AD dataset to the best of our knowledge. Using this unique dataset, we propose LEAD, the first large foundation model for EEG-based AD detection. Our method encompasses an entire pipeline, from data selection and preprocessing to self-supervised contrastive pretraining, fine-tuning, and key setups such as subject-independent evaluation and majority voting for subject-level detection. We pre-train the model on 11 EEG datasets and fine-tune it in a unified manner on 5 AD datasets. Our self-supervised pre-training design includes sample-level and subject-level contrasting to extract useful general EEG features. Fine-tuning is performed on the 5 channel-aligned datasets together. The backbone encoder incorporates temporal and channel embeddings to capture features across both temporal and spatial dimensions. Our method demonstrates outstanding AD detection performance, achieving up to a 9.86% increase in F1 score at the sample level and up to 9.31% at the subject level compared to state-of-the-art methods. These results strongly support the effectiveness of contrastive pre-training and channel-aligned unified fine-tuning for addressing inter-subject variation. The source code is at https://github.com/DL4mHealth/LEAD.
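
The abstract mentions majority voting for subject-level detection. The snippet below is a minimal sketch of that aggregation step, assuming each EEG sample already has a predicted label and a known subject ID. The function name `subject_level_predictions` and the tie-breaking rule are illustrative assumptions and are not taken from the LEAD repository.

```python
from collections import Counter, defaultdict


def subject_level_predictions(sample_preds, subject_ids):
    """Aggregate per-sample labels into one label per subject by majority vote."""
    votes = defaultdict(list)
    for pred, sid in zip(sample_preds, subject_ids):
        votes[sid].append(pred)

    result = {}
    for sid, preds in votes.items():
        counts = Counter(preds)
        top = max(counts.values())
        # Assumption: ties go to the smaller label. The abstract does not
        # say how ties are handled, so this is an arbitrary choice.
        result[sid] = min(label for label, c in counts.items() if c == top)
    return result


# Hypothetical predictions for eight samples from three subjects.
sample_preds = [1, 1, 0, 0, 0, 1, 1, 0]
subject_ids = ["s1", "s1", "s1", "s2", "s2", "s2", "s3", "s3"]
print(subject_level_predictions(sample_preds, subject_ids))
```

Subject-level metrics such as F1 would then be computed on these per-subject labels instead of on individual samples.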
Problem

Research questions and friction points this paper is trying to address.

Detects Alzheimer's Disease using EEG data
Addresses lack of large EEG datasets for AD
Improves detection accuracy across different subjects
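
Measuring accuracy "across different subjects" requires that no subject contributes samples to both the training and the test split. The following is a rough sketch of such a subject-independent split using only the standard library. The fold count, the seed, and the name `subject_independent_folds` are assumptions made for illustration; the page does not state the exact protocol used in the paper.

```python
import random


def subject_independent_folds(subject_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs in which no subject appears on both sides."""
    subjects = sorted(set(subject_ids))
    random.Random(seed).shuffle(subjects)
    # Assign whole subjects (not individual samples) to folds, round-robin.
    fold_of = {s: i % n_folds for i, s in enumerate(subjects)}
    for k in range(n_folds):
        test_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] == k]
        train_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] != k]
        yield train_idx, test_idx


# Hypothetical subject IDs for ten samples from six subjects.
subject_ids = ["a", "a", "b", "b", "b", "c", "d", "d", "e", "f"]
for train_idx, test_idx in subject_independent_folds(subject_ids, n_folds=3):
    test_subjects = sorted({subject_ids[i] for i in test_idx})
    print("test subjects:", test_subjects, "train size:", len(train_idx))
```

Splitting by sample instead of by subject would let the model see other segments of a test subject's recording during training, which tends to inflate reported scores.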
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large EEG-AD dataset with 813 subjects
Self-supervised contrastive pre-training design
Channel-aligned unified fine-tuning for AD detection
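
To illustrate how contrasting at two granularities can work, here is a hedged NumPy sketch of an InfoNCE-style loss in which positives are selected by a boolean mask. With sample IDs, positives are other views of the same sample; with subject IDs, positives are any samples from the same subject. The loss form, the temperature, the unweighted sum of the two terms, and the names `contrastive_loss` and `same_id_mask` are all assumptions for illustration and may differ from the loss used in LEAD.

```python
import numpy as np


def same_id_mask(ids):
    """Boolean matrix whose entry (i, j) is True when items i and j share an ID."""
    ids = np.asarray(ids)
    return ids[:, None] == ids[None, :]


def contrastive_loss(z, positive_mask, temperature=0.1):
    """InfoNCE-style loss averaged over anchors that have at least one positive."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / temperature
    self_mask = np.eye(len(z), dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)  # an anchor never contrasts with itself
    # Numerically stable log-softmax over all other embeddings.
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - sim_max - np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))
    pos = positive_mask & ~self_mask
    has_pos = pos.any(axis=1)
    # Mean log-probability of the positives for each anchor.
    pos_log_prob = np.where(pos, log_prob, 0.0).sum(axis=1)[has_pos]
    per_anchor = -pos_log_prob / pos.sum(axis=1)[has_pos]
    return per_anchor.mean()


# Hypothetical batch: 4 samples x 2 augmented views, drawn from 2 subjects.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))  # stand-in for encoder outputs
sample_ids = [0, 0, 1, 1, 2, 2, 3, 3]
subject_ids = [0, 0, 0, 0, 1, 1, 1, 1]
loss = contrastive_loss(z, same_id_mask(sample_ids)) + contrastive_loss(z, same_id_mask(subject_ids))
print(float(loss))
```

The subject-level term pulls together embeddings from the same person, which is one plausible way to make the learned features less sensitive to sample-to-sample noise within a subject.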
Authors
Yihe Wang
University of North Carolina at Charlotte
Deep Learning, EEG, BCI, Medical Time-series, Foundation Model
Nan Huang
Department of Computer Science, University of North Carolina at Charlotte, United States
Nadia Mammone
University Mediterranea of Reggio Calabria
Marco Cecchi
Cognision, Kentucky, United States
Xiang Zhang
Department of Computer Science, University of North Carolina at Charlotte, United States