DistilCLIP-EEG: Enhancing Epileptic Seizure Detection Through Multi-modal Learning and Knowledge Distillation

📅 2025-10-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing epilepsy detection methods predominantly rely on unimodal EEG signals, overlooking the synergistic potential of multimodal data. To address this limitation, this paper proposes the first multimodal framework integrating EEG and clinical text. Innovatively adapting the CLIP paradigm to epilepsy analysis, we design a Conformer-based EEG encoder and a learnable-prompt BERT (BERT-LP) text encoder, enabling cross-modal contrastive learning in a shared latent space. Additionally, we introduce a knowledge distillation strategy to compress the model while preserving diagnostic performance. Evaluated on three public benchmarks—TUSZ, AUBMC, and CHB-MIT—the framework achieves >97% accuracy and F1-scores exceeding 0.94. The distilled student model retains only 58.1% of the teacher’s parameters, significantly enhancing deployment efficiency. This work establishes a novel, lightweight paradigm for multimodal auxiliary diagnosis of epilepsy.

📝 Abstract
Epilepsy is a prevalent neurological disorder marked by sudden, brief episodes of excessive neuronal activity caused by abnormal electrical discharges, which may lead to mental disorders. Most existing deep learning methods for epilepsy detection rely solely on unimodal EEG signals, neglecting the potential benefits of multimodal information. To address this, we propose a novel multimodal model, DistilCLIP-EEG, based on the CLIP framework, which integrates both EEG signals and text descriptions to capture comprehensive features of epileptic seizures. The model comprises an EEG encoder based on the Conformer architecture and, as the text encoder, the proposed Learnable Prompt BERT (BERT-LP), which incorporates prompt learning within the encoder. Both operate in a shared latent space for effective cross-modal representation learning. To enhance efficiency and adaptability, we introduce a knowledge distillation method in which the trained DistilCLIP-EEG serves as a teacher guiding a more compact student model, reducing training complexity and time. On the TUSZ, AUBMC, and CHB-MIT datasets, both the teacher and student models achieved accuracy rates exceeding 97%. Across all datasets, the F1-scores were consistently above 0.94, demonstrating the robustness and reliability of the proposed framework. Moreover, the student model's parameter count and model size are approximately 58.1% of those of the teacher model, significantly reducing model complexity and storage requirements while maintaining high performance. These results highlight the potential of our proposed model for EEG-based epilepsy detection and establish a solid foundation for deploying lightweight models in resource-constrained settings.
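The cross-modal training described in the abstract follows the CLIP recipe: matched EEG/text embedding pairs are pulled together in the shared latent space while mismatched pairs in the same batch are pushed apart by a symmetric contrastive loss. A minimal NumPy sketch of that objective (illustrative only; the function name, embedding shapes, and temperature value are assumptions, not the paper's exact implementation):

```python
import numpy as np

def clip_contrastive_loss(eeg_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired EEG/text embeddings.

    eeg_emb, text_emb: (batch, dim) arrays; row i of each is a matched pair.
    Temperature and shapes are illustrative assumptions.
    """
    # L2-normalise so the dot product is a cosine similarity
    eeg = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    logits = eeg @ txt.T / temperature   # (batch, batch) similarity matrix
    labels = np.arange(len(logits))      # matched pairs sit on the diagonal

    def cross_entropy(l, y):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(y)), y].mean()

    # average the EEG->text and text->EEG directions
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

In practice each encoder's output would be projected to a common dimension before this loss; the sketch assumes that projection has already happened.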
Problem

Research questions and friction points this paper is trying to address.

Improving epileptic seizure detection using multimodal EEG and text data
Addressing limitations of unimodal deep learning methods for epilepsy
Reducing model complexity through knowledge distillation for efficient deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates EEG signals and text descriptions for multimodal learning
Uses knowledge distillation to create compact student model
Leverages Conformer and BERT encoders in shared latent space
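The distillation contribution listed above compresses the teacher into a student with roughly 58.1% of the parameters. The paper's exact loss weighting is not given here, but the standard recipe (Hinton-style knowledge distillation) blends a temperature-softened KL term against the teacher with the usual hard-label loss; a NumPy sketch under that assumption:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Hinton-style KD loss; T and alpha are illustrative assumptions.

    Blends KL(teacher || student) on temperature-softened distributions
    with the hard-label cross-entropy on the student's own predictions.
    """
    p_t = softmax(teacher_logits / T)
    log_p_s = np.log(softmax(student_logits / T))
    # KL divergence, rescaled by T^2 to keep gradient magnitudes comparable
    kd = (p_t * (np.log(p_t) - log_p_s)).sum(axis=1).mean() * T * T

    p_hard = softmax(student_logits)
    ce = -np.log(p_hard[np.arange(len(labels)), labels]).mean()

    return alpha * kd + (1 - alpha) * ce
```

When the student's logits match the teacher's, the KD term vanishes and only the hard-label loss remains; alpha trades off imitation against ground-truth fit.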
Zexin Wang
Aliyun School of Big Data, Changzhou University, China
Lin Shi
Beihang University, Software Engineering
Haoyu Wu
Department of Computing, Xi’an JiaoTong-Liverpool University, China, and also with the Department of Computer Science, University of Liverpool, UK
Junru Luo
Aliyun School of Big Data, Changzhou University, China
Xiangzeng Kong
Center for Artificial Intelligence in Agriculture, Fujian Agriculture and Forestry University, China
Jun Qi
Department of Computing, Xi’an JiaoTong-Liverpool University, China