Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG

📅 2024-11-28
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This study addresses the challenge of transferring knowledge from abundant unlabeled, high-density EEG data to low-density, few-shot EEG classification tasks. The authors propose a unified graph-based contrastive masked-autoencoding distillation framework. It introduces, for the first time, a joint pre-training paradigm that integrates graph contrastive learning with graph masked autoencoding; a graph topology distillation loss that enables structure-aware knowledge transfer under electrode dropout; and a teacher–student collaborative contrastive pre-training mechanism. The approach combines graph neural networks, contrastive learning, masked modeling, knowledge distillation, and graph topological modeling. Evaluated on four clinical EEG classification tasks across two real-world datasets, the method consistently outperforms state-of-the-art baselines, achieving higher classification accuracy with lower computational overhead and thus balancing performance and efficiency.
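To make the joint pre-training idea concrete, the sketch below pairs a masked-autoencoding reconstruction term with an InfoNCE contrast over two masked views of the same EEG graph, i.e., it reconstructs contrastive samples and contrasts the reconstructions. The encoder/decoder interfaces, the node-masking ratio, and the loss weighting are illustrative assumptions, not details taken from the paper.

```python
# Illustrative PyTorch sketch of a unified graph contrastive + masked
# autoencoding pre-training objective. `encoder` and `decoder` are assumed
# callables mapping (node_features, adjacency) -> node-level outputs.
import torch
import torch.nn.functional as F


def mask_nodes(x: torch.Tensor, mask_ratio: float = 0.5):
    """Zero out a random subset of electrode (node) features."""
    mask = torch.rand(x.size(0), device=x.device) < mask_ratio
    x_masked = x.clone()
    x_masked[mask] = 0.0
    return x_masked, mask


def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.2):
    """Node-wise InfoNCE: matching nodes across the two views are positives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                     # (N, N) similarities
    targets = torch.arange(z1.size(0), device=z1.device)   # positives on the diagonal
    return F.cross_entropy(logits, targets)


def joint_pretrain_loss(encoder, decoder, x, adj, alpha: float = 1.0):
    """Reconstruct two masked (contrastive) views and contrast the reconstructions."""
    view_a, mask_a = mask_nodes(x)
    view_b, mask_b = mask_nodes(x)
    rec_a = decoder(encoder(view_a, adj), adj)
    rec_b = decoder(encoder(view_b, adj), adj)
    loss_rec = F.mse_loss(rec_a[mask_a], x[mask_a]) + F.mse_loss(rec_b[mask_b], x[mask_b])
    loss_con = info_nce(rec_a, rec_b)                      # couple generative and contrastive terms
    return loss_rec + alpha * loss_con
```

In the spirit of the paper, an encoder pre-trained with an objective like this on abundant unlabeled recordings would then be fine-tuned on the labeled downstream task.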

📝 Abstract
Effectively utilizing extensive unlabeled high-density EEG data to improve performance in scenarios with limited labeled low-density EEG data presents a significant challenge. In this paper, we address this by framing it as a graph transfer learning and knowledge distillation problem. We propose a Unified Pre-trained Graph Contrastive Masked Autoencoder Distiller, named EEG-DisGCMAE, to bridge the gap between unlabeled/labeled and high/low-density EEG data. To fully leverage the abundant unlabeled EEG data, we introduce a novel unified graph self-supervised pre-training paradigm, which seamlessly integrates Graph Contrastive Pre-training and Graph Masked Autoencoder Pre-training. This approach synergistically combines contrastive and generative pre-training techniques by reconstructing contrastive samples and contrasting the reconstructions. For knowledge distillation from high-density to low-density EEG data, we propose a Graph Topology Distillation loss function, allowing a lightweight student model trained on low-density data to learn from a teacher model trained on high-density data, effectively handling missing electrodes through contrastive distillation. To integrate transfer learning and distillation, we jointly pre-train the teacher and student models by contrasting their queries and keys during pre-training, enabling robust distillers for downstream tasks. We demonstrate the effectiveness of our method on four classification tasks across two clinical EEG datasets with abundant unlabeled data and limited labeled data. The experimental results show that our approach significantly outperforms contemporary methods in both efficiency and accuracy.
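As one way to read the distillation part of the abstract, the sketch below combines a relational (topology) matching term, computed on the electrodes shared by the high- and low-density montages, with a query/key contrast between student and teacher embeddings. The electrode-index bookkeeping, the KL formulation, and the temperatures are assumptions for illustration; this is not the paper's exact Graph Topology Distillation loss.

```python
# Illustrative sketch of structure-aware distillation from a high-density
# teacher to a low-density student. `kept_idx` lists where the low-density
# electrodes sit inside the high-density montage (an assumed convention).
import torch
import torch.nn.functional as F


def relational_topology(z: torch.Tensor, temperature: float = 0.5):
    """Row-normalized electrode-to-electrode similarity matrix."""
    z = F.normalize(z, dim=-1)
    return F.softmax(z @ z.t() / temperature, dim=-1)


def topology_distillation(z_teacher_hd, z_student_ld, kept_idx):
    """Match the student's relational structure to the teacher's on shared electrodes."""
    t_topo = relational_topology(z_teacher_hd[kept_idx]).detach()  # teacher provides targets
    s_topo = relational_topology(z_student_ld)
    return F.kl_div(s_topo.log(), t_topo, reduction="batchmean")


def query_key_contrast(q_student, k_teacher, temperature: float = 0.07):
    """InfoNCE between student queries and teacher keys for the same trials."""
    q = F.normalize(q_student, dim=-1)
    k = F.normalize(k_teacher, dim=-1).detach()
    logits = q @ k.t() / temperature
    targets = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, targets)


def distillation_loss(z_teacher_hd, z_student_ld, g_student, g_teacher, kept_idx, beta=1.0):
    """Topology matching on node embeddings plus query/key contrast on graph embeddings."""
    return topology_distillation(z_teacher_hd, z_student_ld, kept_idx) \
        + beta * query_key_contrast(g_student, g_teacher)
```

Under this reading, the lightweight student would receive gradients from both its own pre-training objective and these distillation terms, while the high-density teacher supplies the topology targets and keys.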
Problem

Research questions and friction points this paper is trying to address.

Bridging the gap between unlabeled and labeled EEG data
Handling missing electrodes via contrastive distillation
Improving low-density EEG performance with pre-training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified graph self-supervised pre-training paradigm
Graph topology distillation loss function
Contrastive distillation for missing electrodes
Xinxu Wei
Lehigh University
Machine Learning, AI, Foundation Model, Pre-Training, Graph Neural Networks
Kanhao Zhao
Department of Bioengineering, Lehigh University, Bethlehem, PA, USA
Yong Jiao
Department of Bioengineering, Lehigh University, Bethlehem, PA, USA
Nancy B. Carlisle
Department of Psychology, Lehigh University, Bethlehem, PA, USA
Hua Xie
Center for Neuroscience Research, Children’s National Hospital, Washington, DC, USA
Yu Zhang
Department of Electrical and Computer Engineering, Lehigh University, Bethlehem, PA, USA; Department of Bioengineering, Lehigh University, Bethlehem, PA, USA