🤖 AI Summary
This study addresses early stroke prediction, which is hindered by the lack of continuous pre-event physiological signals in standard clinical datasets. Focusing on hospitalized patients who were already under continuous monitoring when their stroke occurred, the authors pinpoint stroke onset times from clinical notes (extracted with large language model (LLM) assistance and validated by physicians) and extract hemodynamic features from photoplethysmography (PPG) signals. A ResNet-1D temporal model is then trained to predict stroke risk at multiple early-warning horizons. The work reports, for the first time on real-world datasets (MIMIC-III and MC-MED), that PPG signals carry predictive information 4–6 hours before stroke onset, with F1 scores at the 6-hour horizon of 0.9406 on MIMIC-III and 0.9888 on MC-MED. These findings support moving stroke detection from reactive clinical response toward proactive physiological monitoring.
📝 Abstract
The absence of pre-hospital physiological data in standard clinical datasets fundamentally constrains the early prediction of stroke: patients typically present only after a stroke has occurred, so the predictive value of continuous monitoring signals such as photoplethysmography (PPG) remains unvalidated. In this work, we overcome this limitation by focusing on a rare but clinically critical cohort, namely patients who suffered a stroke during hospitalization while already under continuous monitoring, thereby enabling the first large-scale analysis of pre-stroke PPG waveforms aligned to verified onset times. Using MIMIC-III and MC-MED, we develop an LLM-assisted data mining pipeline to extract precise in-hospital stroke onset timestamps from unstructured clinical notes, followed by physician validation. This identifies 176 patients (MIMIC-III) and 158 patients (MC-MED) with high-quality synchronized pre-onset PPG data. We then extract hemodynamic features from PPG and employ a ResNet-1D model to predict impending stroke across multiple early-warning horizons. The model achieves F1-scores of 0.7956, 0.8759, and 0.9406 at 4, 5, and 6 hours prior to onset on MIMIC-III, and, without re-tuning, reaches 0.9256, 0.9595, and 0.9888 on MC-MED for the same horizons. These results provide the first empirical evidence from real-world clinical data that PPG contains predictive signatures of stroke several hours before onset. They indicate that passively acquired physiological signals can support reliable early warning, and they support a shift from post-event stroke recognition to proactive, physiology-based surveillance that may materially improve patient outcomes in routine clinical care.
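To make the notions of "early-warning horizon" and F1-score concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a uniformly sampled PPG recording that starts at t = 0 s, and it assumes a window is the fixed-length segment ending exactly the given number of hours before the verified onset time. The abstract does not specify window length, sampling rate, or how negative examples are drawn, so the function names (`pre_onset_window`, `f1_score`) and all parameters here are hypothetical. The ResNet-1D model itself is not shown.

```python
from typing import List, Sequence


def pre_onset_window(
    signal: Sequence[float],
    fs_hz: float,
    onset_s: float,
    horizon_h: float,
    window_s: float,
) -> List[float]:
    """Return the `window_s` seconds of signal ending `horizon_h` hours before onset.

    Assumption: `signal` starts at t = 0 s and is sampled uniformly at `fs_hz`.
    """
    end_s = onset_s - horizon_h * 3600.0   # window ends this many seconds into the recording
    start_s = end_s - window_s
    if start_s < 0:
        raise ValueError("recording does not reach far enough before onset")
    start_i = int(round(start_s * fs_hz))
    end_i = int(round(end_s * fs_hz))
    return list(signal[start_i:end_i])


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 = 2*TP / (2*TP + FP + FN), with label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy 8-hour recording at 1 Hz whose sample value equals its time index in seconds.
    toy_signal = list(range(8 * 3600))
    onset = 7 * 3600.0  # hypothetical verified onset, 7 hours into the recording
    for horizon in (4, 5, 6):
        w = pre_onset_window(toy_signal, fs_hz=1.0, onset_s=onset,
                             horizon_h=horizon, window_s=60.0)
        print(horizon, len(w), w[0], w[-1])

    # Toy labels: 1 = window precedes a stroke, 0 = control window.
    print(round(f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]), 4))
```

In a setup like this, one classifier evaluation per horizon (4, 5, and 6 hours) would yield one F1-score per horizon, which is the shape of the results reported in the abstract.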