Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG

📅 2024-11-28
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This study addresses the challenge of transferring knowledge from abundant unlabeled, high-density EEG data to low-density, few-shot EEG classification tasks. The authors propose a unified graph-based contrastive masked-autoencoding distillation framework. It introduces, for the first time, a joint pre-training paradigm that integrates graph contrastive learning with graph masked autoencoding; a graph topology distillation loss that enables structure-aware knowledge transfer under electrode dropout; and a teacher–student collaborative contrastive pre-training mechanism. The approach combines graph neural networks, contrastive learning, masked modeling, knowledge distillation, and graph topological modeling. Evaluated on four clinical EEG classification tasks across two real-world datasets, the method consistently outperforms state-of-the-art baselines, achieving higher classification accuracy with lower computational overhead and thus balancing performance and efficiency.
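To make the joint pre-training idea concrete, the sketch below pairs a masked-autoencoding reconstruction term with an InfoNCE contrast over two masked views of the same EEG graph, i.e., it reconstructs contrastive samples and contrasts the reconstructions. The encoder/decoder interfaces, the node-masking ratio, and the loss weighting are illustrative assumptions, not details taken from the paper.

```python
# Illustrative PyTorch sketch of a unified graph contrastive + masked
# autoencoding pre-training objective. `encoder` and `decoder` are assumed
# callables mapping (node_features, adjacency) -> node-level outputs.
import torch
import torch.nn.functional as F


def mask_nodes(x: torch.Tensor, mask_ratio: float = 0.5):
    """Zero out a random subset of electrode (node) features."""
    mask = torch.rand(x.size(0), device=x.device) < mask_ratio
    x_masked = x.clone()
    x_masked[mask] = 0.0
    return x_masked, mask


def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.2):
    """Node-wise InfoNCE: matching nodes across the two views are positives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                     # (N, N) similarities
    targets = torch.arange(z1.size(0), device=z1.device)   # positives on the diagonal
    return F.cross_entropy(logits, targets)


def joint_pretrain_loss(encoder, decoder, x, adj, alpha: float = 1.0):
    """Reconstruct two masked (contrastive) views and contrast the reconstructions."""
    view_a, mask_a = mask_nodes(x)
    view_b, mask_b = mask_nodes(x)
    rec_a = decoder(encoder(view_a, adj), adj)
    rec_b = decoder(encoder(view_b, adj), adj)
    loss_rec = F.mse_loss(rec_a[mask_a], x[mask_a]) + F.mse_loss(rec_b[mask_b], x[mask_b])
    loss_con = info_nce(rec_a, rec_b)                      # couple generative and contrastive terms
    return loss_rec + alpha * loss_con
```

In the spirit of the paper, an encoder pre-trained with an objective like this on abundant unlabeled recordings would then be fine-tuned on the labeled downstream task.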

📝 Abstract
Effectively utilizing extensive unlabeled high-density EEG data to improve performance in scenarios with limited labeled low-density EEG data presents a significant challenge. In this paper, we address this by framing it as a graph transfer learning and knowledge distillation problem. We propose a Unified Pre-trained Graph Contrastive Masked Autoencoder Distiller, named EEG-DisGCMAE, to bridge the gap between unlabeled/labeled and high/low-density EEG data. To fully leverage the abundant unlabeled EEG data, we introduce a novel unified graph self-supervised pre-training paradigm, which seamlessly integrates Graph Contrastive Pre-training and Graph Masked Autoencoder Pre-training. This approach synergistically combines contrastive and generative pre-training techniques by reconstructing contrastive samples and contrasting the reconstructions. For knowledge distillation from high-density to low-density EEG data, we propose a Graph Topology Distillation loss function, allowing a lightweight student model trained on low-density data to learn from a teacher model trained on high-density data, effectively handling missing electrodes through contrastive distillation. To integrate transfer learning and distillation, we jointly pre-train the teacher and student models by contrasting their queries and keys during pre-training, enabling robust distillers for downstream tasks. We demonstrate the effectiveness of our method on four classification tasks across two clinical EEG datasets with abundant unlabeled data and limited labeled data. The experimental results show that our approach significantly outperforms contemporary methods in both efficiency and accuracy.
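As one way to read the distillation part of the abstract, the sketch below combines a relational (topology) matching term, computed on the electrodes shared by the high- and low-density montages, with a query/key contrast between student and teacher embeddings. The electrode-index bookkeeping, the KL formulation, and the temperatures are assumptions for illustration; this is not the paper's exact Graph Topology Distillation loss.

```python
# Illustrative sketch of structure-aware distillation from a high-density
# teacher to a low-density student. `kept_idx` lists where the low-density
# electrodes sit inside the high-density montage (an assumed convention).
import torch
import torch.nn.functional as F


def relational_topology(z: torch.Tensor, temperature: float = 0.5):
    """Row-normalized electrode-to-electrode similarity matrix."""
    z = F.normalize(z, dim=-1)
    return F.softmax(z @ z.t() / temperature, dim=-1)


def topology_distillation(z_teacher_hd, z_student_ld, kept_idx):
    """Match the student's relational structure to the teacher's on shared electrodes."""
    t_topo = relational_topology(z_teacher_hd[kept_idx]).detach()  # teacher provides targets
    s_topo = relational_topology(z_student_ld)
    return F.kl_div(s_topo.log(), t_topo, reduction="batchmean")


def query_key_contrast(q_student, k_teacher, temperature: float = 0.07):
    """InfoNCE between student queries and teacher keys for the same trials."""
    q = F.normalize(q_student, dim=-1)
    k = F.normalize(k_teacher, dim=-1).detach()
    logits = q @ k.t() / temperature
    targets = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, targets)


def distillation_loss(z_teacher_hd, z_student_ld, g_student, g_teacher, kept_idx, beta=1.0):
    """Topology matching on node embeddings plus query/key contrast on graph embeddings."""
    return topology_distillation(z_teacher_hd, z_student_ld, kept_idx) \
        + beta * query_key_contrast(g_student, g_teacher)
```

Under this reading, the lightweight student would receive gradients from both its own pre-training objective and these distillation terms, while the high-density teacher supplies the topology targets and keys.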
Problem

Research questions and friction points this paper is trying to address.

Bridging the gap between unlabeled and labeled EEG data
Handling missing electrodes via contrastive distillation
Improving low-density EEG performance with pre-training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified graph self-supervised pre-training paradigm
Graph topology distillation loss function
Contrastive distillation for missing electrodes
Xinxu Wei
Lehigh University
Machine Learning, AI, Foundation Model, Pre-Training, Graph Neural Networks
Kanhao Zhao
Department of Bioengineering, Lehigh University, Bethlehem, PA, USA
Yong Jiao
Department of Bioengineering, Lehigh University, Bethlehem, PA, USA
Nancy B. Carlisle
Department of Psychology, Lehigh University, Bethlehem, PA, USA
Hua Xie
Center for Neuroscience Research, Children’s National Hospital, Washington, DC, USA
Yu Zhang
Department of Electrical and Computer Engineering, Lehigh University, Bethlehem, PA, USA; Department of Bioengineering, Lehigh University, Bethlehem, PA, USA