🤖 AI Summary
Reconstructing high-fidelity electrocardiogram (ECG) signals from photoplethysmography (PPG) faces two key challenges: fine-grained waveform morphology is difficult to model, and clinically relevant features are often reconstructed with low fidelity. To address these, we propose a vision-based reconstruction framework that encodes single-point PPG sequences into four-channel images (the raw signal, its first- and second-order derivatives, and the area under the curve) and leverages a Vision Transformer (ViT) to jointly model intra-beat and inter-beat long-range dependencies. Crucially, the architecture incorporates physiology-informed feature design to preserve clinical interpretability. Evaluated on standard benchmarks, our method achieves a 29% reduction in Percent Root-mean-square Difference (PRD) and a 15% reduction in Root Mean Square Error (RMSE) over 1D convolutional baselines. Moreover, errors in key clinical metrics, including QRS complex area, PR and RT intervals, and RT amplitude, improve significantly, demonstrating superior fine-grained ECG morphology reconstruction and clinical applicability.
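The four-channel encoding described above can be sketched as follows. The paper does not spell out its normalization or how the channel stack is folded into an image, so this is a minimal illustrative version assuming discrete-difference derivatives, a trapezoid-rule running area, per-channel min-max scaling, and a plain `(4, T)` channel stack:

```python
import numpy as np

def ppg_to_four_channel(ppg, eps=1e-8):
    """Encode a 1-D PPG window as a 4-channel representation:
    raw signal, first-order difference, second-order difference,
    and running area under the curve (cumulative trapezoid rule).
    Normalization choice (per-channel min-max) is an assumption."""
    ppg = np.asarray(ppg, dtype=np.float64)
    d1 = np.gradient(ppg)   # first-order difference (slope / velocity)
    d2 = np.gradient(d1)    # second-order difference (acceleration)
    # running area under the curve via cumulative trapezoid rule
    auc = np.concatenate([[0.0], np.cumsum((ppg[:-1] + ppg[1:]) / 2.0)])
    chans = np.stack([ppg, d1, d2, auc])          # shape (4, T)
    # scale each channel independently into [0, 1]
    mins = chans.min(axis=1, keepdims=True)
    maxs = chans.max(axis=1, keepdims=True)
    return (chans - mins) / (maxs - mins + eps)
```

The resulting `(4, T)` array can then be patchified for a ViT in the usual way; the folding into a square image is left out here since the paper's exact layout is not given.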
📝 Abstract
Reconstructing ECG from PPG is a promising yet challenging task. While recent advancements in generative models have significantly improved ECG reconstruction, accurately capturing fine-grained waveform features remains a key challenge. To address this, we propose a novel PPG-to-ECG reconstruction method that leverages a Vision Transformer (ViT) as the core network. Unlike conventional approaches that rely on single-channel PPG, our method employs a four-channel signal image representation, incorporating the original PPG, its first-order difference, its second-order difference, and the area under the curve. This multi-channel design enriches feature extraction by preserving both temporal and physiological variations within the PPG. By leveraging the self-attention mechanism in ViT, our approach effectively captures both inter-beat and intra-beat dependencies, leading to more robust and accurate ECG reconstruction. Experimental results demonstrate that our method consistently outperforms existing 1D convolution-based approaches, achieving up to a 29% reduction in Percent Root-mean-square Difference (PRD) and a 15% reduction in Root Mean Square Error (RMSE). The proposed approach also yields consistent gains on the remaining evaluation metrics, highlighting its robustness and effectiveness in reconstructing ECG signals. Furthermore, to ensure a clinically relevant evaluation, we introduce new performance metrics, including QRS area error, PR interval error, RT interval error, and RT amplitude difference error. Our findings suggest that integrating a four-channel signal image representation with the self-attention mechanism of ViT enables more effective extraction of informative PPG features and improved modeling of beat-to-beat variations for PPG-to-ECG mapping. Beyond demonstrating the potential of PPG as a viable alternative for heart activity monitoring, our approach opens new avenues for cyclic signal analysis and prediction.