🤖 AI Summary
This study addresses epileptic seizure prediction with a Transformer-based multimodal feature fusion framework. The method combines spatiotemporal features extracted from raw EEG signals by a CNN-LSTM architecture with time-frequency features extracted by ResNet-18 from short-time Fourier transform (STFT) spectrograms. A Transformer encoder fuses the two feature streams, and fully connected layers produce the seizure-onset prediction. The key idea is the joint use of raw-signal and time-frequency representations, complemented by a target-adaptation strategy (fine-tuning a pre-trained model on limited target-patient data) to improve generalization across patients. On the CHB-MIT dataset, the model achieves a mean recall of 98.85% and outperforms most state-of-the-art methods. In cross-patient experiments, fine-tuning yields higher recall, precision, and F1-score than conventional cross-patient validation, and runtime-based computational complexity is assessed across diverse hardware platforms.
📝 Abstract
Epilepsy is one of the most common neurological disorders worldwide; it is characterized by recurring seizures that significantly impair quality of life. Despite advances in diagnostic techniques, mitigating the risks faced by epilepsy patients remains challenging because seizure events are unpredictable, so accurate forecasting of seizure onset can help reduce those risks. In this paper, we propose EEG-FuseFormer, a transformer-based feature fusion framework for seizure-onset prediction that combines intermediate features extracted by a Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) network and a ResNet-18 network. The CNN-LSTM captures both spatial and temporal features directly from the raw EEG signal, whereas ResNet-18 extracts features from the Short-Time Fourier Transform (STFT) representation of the EEG signals. A transformer encoder fuses the two feature sets, and fully connected dense layers generate the final prediction. The proposed model was validated on the CHB-MIT dataset, where it achieves a mean recall of 98.85% and outperforms most state-of-the-art methods. We also evaluate how well the feature fusion model generalizes in cross-patient testing. Fine-tuning pre-trained models on limited target-patient data (target adaptation) within the cross-patient validation framework yields higher recall, precision, and F1-score than the conventional cross-patient validation approach. Finally, the runtime-based computational complexity of the model is assessed across diverse hardware platforms to highlight the performance-complexity trade-off.
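The reported results are stated in terms of recall, precision, and F1-score. As a reminder of how these quantities relate, the following is a minimal sketch using the standard definitions for a binary labelling in which 1 marks a window that precedes a seizure and 0 marks any other window. The labelling convention, the names `BinaryMetrics`, `binary_metrics`, and `mean_recall`, and the choice to average recall per patient are assumptions made for illustration; the abstract does not specify how the mean recall is aggregated.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    """Recall, precision, and F1 for the positive class (hypothetical container)."""
    recall: float
    precision: float
    f1: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute recall, precision, and F1 for the positive class (label 1).

    Assumption: 1 = window preceding a seizure, 0 = any other window.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the confusion-matrix entries that involve the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against empty denominators by reporting 0.0.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return BinaryMetrics(recall=recall, precision=precision, f1=f1)


def mean_recall(per_patient: Sequence[BinaryMetrics]) -> float:
    """Unweighted average of per-patient recall (one possible aggregation)."""
    if not per_patient:
        raise ValueError("need at least one patient")
    return sum(m.recall for m in per_patient) / len(per_patient)


# Toy labels, not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Recall is the natural headline metric here because a missed seizure (a false negative) is the costliest error for a patient, while precision and F1-score indicate how many false alarms accompany that sensitivity.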