A Comprehensive Inference-Time Augmentation Framework in Physiological Signals: Application to PPG-Based AF Detection

📅 2026-06-09

🤖 AI Summary

This work addresses the drop in classification performance that physiological signal models suffer in real-world deployment because of sensor noise, motion artifacts, and distribution shift between training and deployment data. To improve robustness without retraining, the authors propose a unified inference-time augmentation (ITA) framework that integrates 13 augmentation methods spanning time-domain, amplitude-domain, frequency-domain, and artifact-injection transformations, with hyperparameters tuned automatically by Bayesian optimization. The resulting scheme is model-agnostic. Evaluated on PPG-based atrial fibrillation (AF) detection with GPT-PPG and ResNet, standard ITA improves AUROC by up to 8.5% and AUPRC by up to 10.6% (both for GPT-PPG), while selective ITA reduces the average false positive rate on non-AF datasets by up to 4.4%.

📝 Abstract

Objective: Accurate classification of physiological signals in real-world deployments is challenged by sensor noise, motion artifacts, and distribution shifts between training and deployment data. Inference-time augmentation (ITA), which applies augmentations during inference rather than retraining, offers a simple, model-agnostic mechanism to improve robustness. However, ITA application to physiological signals has remained narrow in scope, relying on limited augmentation methods with fixed, unoptimized parameters. This work proposes a unified ITA framework to address that gap. Approach: The framework incorporates 13 augmentation methods spanning time-domain, amplitude-domain, frequency-domain, and artifact-injection transformations, with hyperparameters optimized via Bayesian optimization. We evaluate on atrial fibrillation (AF) detection from 30-second PPG signals using GPT-PPG and ResNet across five datasets comprising more than 400 patients and approximately 9,800 hours of recording. Main results: Standard ITA consistently improved AUROC (up to 8.5% for GPT-PPG and 0.7% for ResNet) and AUPRC (up to 10.6% for GPT-PPG and 0.8% for ResNet). Selective ITA further reduced average FPR by up to 4.4% (GPT-PPG) and 1.3% (ResNet) on non-AF datasets. Significance: These findings establish ITA as a practical, model-agnostic approach for improving PPG-based AF classification reliability in deployment settings where retraining is not feasible, with broader applicability to physiological signal analysis.
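
The general idea of inference-time augmentation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three augmentations, their parameter values, the mean aggregation of predictions, the 50 Hz sampling rate, and the `toy_model` classifier are all assumptions chosen for illustration. The paper's framework uses 13 augmentation methods with parameters tuned by Bayesian optimization, none of which is reproduced here.

```python
import math
import random
from typing import Callable, List, Sequence

Signal = List[float]
Augmentation = Callable[[Signal, random.Random], Signal]


def amplitude_scale(x: Signal, rng: random.Random, max_dev: float = 0.1) -> Signal:
    """Amplitude-domain example: multiply the segment by a random gain near 1."""
    gain = 1.0 + rng.uniform(-max_dev, max_dev)
    return [gain * v for v in x]


def time_shift(x: Signal, rng: random.Random, max_shift: int = 10) -> Signal:
    """Time-domain example: circularly shift the segment by a few samples."""
    k = rng.randint(-max_shift, max_shift) % len(x)
    return x[-k:] + x[:-k] if k else list(x)


def add_noise(x: Signal, rng: random.Random, sigma: float = 0.05) -> Signal:
    """Artifact-style example: add zero-mean Gaussian noise to every sample."""
    return [v + rng.gauss(0.0, sigma) for v in x]


def ita_predict(
    model: Callable[[Signal], float],
    x: Signal,
    augmentations: Sequence[Augmentation],
    n_views: int = 8,
    seed: int = 0,
) -> float:
    """Average the model's probability over the original and augmented views.

    Mean aggregation is an assumption; other aggregation rules are possible.
    The model is only called, never retrained.
    """
    rng = random.Random(seed)
    probs = [model(x)]  # always include the unmodified segment
    for _ in range(n_views):
        augment = rng.choice(augmentations)
        probs.append(model(augment(x, rng)))
    return sum(probs) / len(probs)


def toy_model(x: Signal) -> float:
    """Hypothetical stand-in classifier: maps signal variance to a probability."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return 1.0 / (1.0 + math.exp(-(var - 0.5)))


# Synthetic 30-second segment at an assumed 50 Hz sampling rate (1500 samples).
segment = [math.sin(2 * math.pi * 1.2 * t / 50) for t in range(1500)]
p_af = ita_predict(toy_model, segment, [amplitude_scale, time_shift, add_noise])
print(f"ITA-averaged AF probability: {p_af:.3f}")
```

Because `ita_predict` treats the model as an opaque callable, the same wrapper applies to any classifier that maps a segment to a probability, which is the sense in which the approach is model-agnostic.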

Problem

Research questions and friction points this paper is trying to address.

physiological signals

inference-time augmentation

distribution shift

motion artifacts

PPG-based AF detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Inference-Time Augmentation

Physiological Signal Robustness

Bayesian Optimization