🤖 AI Summary
Traditional EEG machine learning relies on fixed-length segmentation, which lacks neurophysiological justification. This work proposes CTXSEG, a statistically grounded, adaptive segmentation method that detects change points in EEG signals to generate biologically meaningful, variable-length segments. To validate it, the authors introduce CTXGEN, a synthetic data generation framework designed to emulate realistic neural dynamics with ground-truth segmentation boundaries. CTXSEG integrates into standard ML pipelines without requiring modifications to downstream models. In epileptic seizure detection, CTXSEG achieves statistically significant performance gains (p < 0.01) while reducing the number of segments required, consistently outperforming fixed-length baselines under standardized evaluation protocols. Its core innovation lies in using the nonstationarity of neural dynamics as the principled basis for segmentation, yielding a statistical, physiologically interpretable, and model-agnostic adaptive representation framework for EEG analysis.
📝 Abstract
Objective. Electroencephalography (EEG) data is derived by sampling continuous neurological time series signals. To prepare an EEG signal for machine learning, it must be divided into manageable segments. The current naive approach uses arbitrary fixed time slices, which may have limited biological relevance because brain states are not confined to fixed intervals. We investigate whether adaptive segmentation methods are beneficial for machine learning EEG analysis.
Approach. We introduce a novel adaptive segmentation method, CTXSEG, that creates variable-length segments based on statistical differences in the EEG data and propose ways to use them with modern machine learning approaches that typically require fixed-length input. We assess CTXSEG using controllable synthetic data generated by our novel signal generator CTXGEN. While our CTXSEG method has general utility, we validate it on a real-world use case by applying it to an EEG seizure detection problem. We compare the performance of CTXSEG with fixed-length segmentation in the preprocessing step of a typical EEG machine learning pipeline for seizure detection.
Main results. We found that preparing EEG data with CTXSEG improves seizure detection performance compared to fixed-length approaches while requiring fewer segments, when evaluated using a standardized framework and without modifying the machine learning method.
Significance. This work demonstrates that adaptive segmentation with CTXSEG can be readily applied to modern machine learning approaches, with potential to improve performance. It is a promising alternative to fixed-length segmentation for signal preprocessing and should be considered as part of the standard preprocessing repertoire in EEG machine learning applications.
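As a rough illustration of the idea described under Approach (variable-length segments cut where the signal's statistics change, then adapted for models that expect fixed-length input), the following minimal sketch may help. It is not the CTXSEG algorithm, whose details are not given here. The change statistic (log-ratio of variances in adjacent windows), the peak-picking rule, the linear-interpolation resampling, and all names and parameters (`change_statistic`, `adaptive_segments`, `to_fixed_length`, `window`, `threshold`, `length`) are assumptions chosen for illustration only.

```python
import numpy as np

EPS = 1e-12  # guards against log(0) on perfectly flat windows


def change_statistic(signal, window):
    """Absolute log-variance ratio between the windows left and right of each sample.

    Hypothetical statistic for illustration; not the one used by CTXSEG.
    """
    stat = np.zeros(len(signal))
    for t in range(window, len(signal) - window + 1):
        left = signal[t - window:t]
        right = signal[t:t + window]
        stat[t] = abs(np.log(right.var() + EPS) - np.log(left.var() + EPS))
    return stat


def adaptive_segments(signal, window=64, threshold=1.0):
    """Return (start, end) index pairs for variable-length segments."""
    candidates = change_statistic(signal, window)
    boundaries = [0]
    # Greedy peak picking: accept the strongest change, then suppress
    # its neighbourhood so one transition yields a single boundary.
    while candidates.max() > threshold:
        t = int(candidates.argmax())
        boundaries.append(t)
        candidates[max(0, t - window):t + window + 1] = 0.0
    boundaries = sorted(boundaries) + [len(signal)]
    return list(zip(boundaries[:-1], boundaries[1:]))


def to_fixed_length(signal, segments, length=128):
    """Resample each segment to `length` samples by linear interpolation
    (one assumed way to feed variable-length segments to a fixed-input model)."""
    out = np.empty((len(segments), length))
    dst = np.linspace(0.0, 1.0, num=length)
    for i, (start, end) in enumerate(segments):
        seg = signal[start:end]
        src = np.linspace(0.0, 1.0, num=len(seg))
        out[i] = np.interp(dst, src, seg)
    return out


# Toy signal: a sinusoid whose amplitude jumps at sample 1004.
n = np.arange(2000)
amplitude = np.where(n < 1004, 1.0, 5.0)
signal = amplitude * np.sin(2 * np.pi * n / 16)

segments = adaptive_segments(signal)
batch = to_fixed_length(signal, segments)
print(segments, batch.shape)
```

On this toy signal the only change in statistics is the amplitude jump, so the sketch cuts there instead of at arbitrary fixed intervals, and the resulting batch has one row per segment. Real EEG would call for a more robust statistic and multichannel handling, which this sketch deliberately leaves out.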