DistilCLIP-EEG: Enhancing Epileptic Seizure Detection Through Multi-modal Learning and Knowledge Distillation

📅 2025-10-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing epilepsy detection methods predominantly rely on unimodal EEG signals, overlooking the synergistic potential of multimodal data. To address this limitation, this paper proposes the first multimodal framework integrating EEG and clinical text. Innovatively adapting the CLIP paradigm to epilepsy analysis, we design a Conformer-based EEG encoder and a learnable-prompt BERT (BERT-LP) text encoder, enabling cross-modal contrastive learning in a shared latent space. Additionally, we introduce a knowledge distillation strategy to compress the model while preserving diagnostic performance. Evaluated on three public benchmarks—TUSZ, AUBMC, and CHB-MIT—the framework achieves >97% accuracy and F1-scores exceeding 0.94. The distilled student model retains only 58.1% of the teacher’s parameters, significantly enhancing deployment efficiency. This work establishes a novel, lightweight paradigm for multimodal auxiliary diagnosis of epilepsy.

📝 Abstract
Epilepsy is a prevalent neurological disorder marked by sudden, brief episodes of excessive neuronal activity caused by abnormal electrical discharges, which may lead to mental disorders. Most existing deep learning methods for epilepsy detection rely solely on unimodal EEG signals, neglecting the potential benefits of multimodal information. To address this, we propose a novel multimodal model, DistilCLIP-EEG, based on the CLIP framework, which integrates both EEG signals and text descriptions to capture comprehensive features of epileptic seizures. The model comprises an EEG encoder based on the Conformer architecture and, as the text encoder, the proposed Learnable Prompt BERT (BERT-LP), which incorporates prompt learning within the encoder. Both operate in a shared latent space for effective cross-modal representation learning. To enhance efficiency and adaptability, we introduce a knowledge distillation method in which the trained DistilCLIP-EEG serves as a teacher guiding a more compact student model, reducing training complexity and time. On the TUSZ, AUBMC, and CHB-MIT datasets, both the teacher and student models achieved accuracy rates exceeding 97%. Across all datasets, the F1-scores were consistently above 0.94, demonstrating the robustness and reliability of the proposed framework. Moreover, the student model's parameter count and model size are approximately 58.1% of those of the teacher model, significantly reducing model complexity and storage requirements while maintaining high performance. These results highlight the potential of our proposed model for EEG-based epilepsy detection and establish a solid foundation for deploying lightweight models in resource-constrained settings.
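The cross-modal training described in the abstract follows the CLIP recipe: matched EEG/text embedding pairs are pulled together in the shared latent space while mismatched pairs in the same batch are pushed apart by a symmetric contrastive loss. A minimal NumPy sketch of that objective (illustrative only; the function name, embedding shapes, and temperature value are assumptions, not the paper's exact implementation):

```python
import numpy as np

def clip_contrastive_loss(eeg_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired EEG/text embeddings.

    eeg_emb, text_emb: (batch, dim) arrays; row i of each is a matched pair.
    Temperature and shapes are illustrative assumptions.
    """
    # L2-normalise so the dot product is a cosine similarity
    eeg = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    logits = eeg @ txt.T / temperature   # (batch, batch) similarity matrix
    labels = np.arange(len(logits))      # matched pairs sit on the diagonal

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # average the EEG->text and text->EEG directions
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

In practice each encoder's output would be projected to a common dimension before this loss; the sketch assumes that projection has already happened.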
Problem

Research questions and friction points this paper is trying to address.

Improving epileptic seizure detection using multimodal EEG and text data
Addressing limitations of unimodal deep learning methods for epilepsy
Reducing model complexity through knowledge distillation for efficient deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates EEG signals and text descriptions for multimodal learning
Uses knowledge distillation to create compact student model
Leverages Conformer and BERT encoders in shared latent space
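The distillation contribution listed above compresses the teacher into a student with roughly 58.1% of the parameters. The paper's exact loss weighting is not given here, but the standard recipe (Hinton-style knowledge distillation) blends a temperature-softened KL term against the teacher with the usual hard-label loss; a NumPy sketch under that assumption:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Hinton-style KD loss; T and alpha are illustrative assumptions.

    Blends KL(teacher || student) on temperature-softened distributions
    with the hard-label cross-entropy on the student's own predictions.
    """
    p_t = softmax(teacher_logits / T)
    log_p_s = np.log(softmax(student_logits / T))
    # KL divergence, rescaled by T^2 to keep gradient magnitudes comparable
    kd = (p_t * (np.log(p_t) - log_p_s)).sum(axis=1).mean() * T * T

    p_hard = softmax(student_logits)
    ce = -np.log(p_hard[np.arange(len(labels)), labels]).mean()

    return alpha * kd + (1 - alpha) * ce
```

When the student's logits match the teacher's, the KD term vanishes and only the hard-label loss remains; alpha trades off imitation against ground-truth fit.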
Zexin Wang
Aliyun School of Big Data, Changzhou University, China
Lin Shi
Beihang University, Software Engineering
Haoyu Wu
Department of Computing, Xi’an JiaoTong-Liverpool University, China, and also with the Department of Computer Science, University of Liverpool, UK
Junru Luo
Aliyun School of Big Data, Changzhou University, China
Xiangzeng Kong
Center for Artificial Intelligence in Agriculture, Fujian Agriculture and Forestry University, China
Jun Qi
Department of Computing, Xi’an JiaoTong-Liverpool University, China