🤖 AI Summary
This work addresses erroneous pseudo-labeling and overconfident predictions in semi-supervised electrocardiogram (ECG) classification, both caused by out-of-distribution (OOD) samples in the unlabeled data. To mitigate these issues, the authors propose a calibration-aware safe semi-supervised learning framework that uses a time-frequency dual-branch network to extract ECG-specific representations. The framework jointly optimizes a multi-class classifier and an OOD detector, and applies adaptive label smoothing and temperature scaling in the joint time-frequency space to dynamically calibrate prediction confidence. Combined with ECG-tailored augmentation strategies, the proposed method achieves state-of-the-art classification accuracy and calibration performance on the PTB-XL and PhysioNet/CinC Challenge benchmarks, supporting more reliable knowledge discovery in open-set ECG analysis.
📝 Abstract
Electrocardiogram (ECG) classification models often suffer from severe label scarcity, making semi-supervised learning (SSL) an attractive strategy for reducing annotation costs. In clinical settings, however, unlabeled pools frequently contain out-of-distribution (OOD) anomalies or diagnostic groups absent from the labeled set. Standard SSL assigns these unseen-class samples pseudo-labels drawn from the known classes, which are necessarily incorrect and yield overconfident predictions. To address this, we propose SafeECGMatch, a calibration-aware safe SSL framework for single-label ECG classification under label distribution mismatch. SafeECGMatch employs a dual-branch architecture that extracts time-frequency latent representations using ECG-specific augmentations. Crucially, it dynamically aligns confidence with empirical accuracy through adaptive label smoothing and temperature scaling, calibrating both the multiclass classifier and the OOD detector across temporal and spectral domains. This joint optimization enables trustworthy OOD rejection and reliable pseudo-labeling. Evaluated on the PTB-XL and PhysioNet/CinC Challenge benchmarks, SafeECGMatch achieves state-of-the-art accuracy and calibration, advancing reliable knowledge discovery in physiological time-series. Code is available at https://github.com/labhai/SafeECGMatch.
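The calibration and rejection ideas named in the abstract can be illustrated with a minimal, generic sketch. It is not the SafeECGMatch implementation: the function names, the thresholds, and the rule that ties the smoothing strength to the gap between confidence and accuracy are all assumptions made for illustration. Only the building blocks (temperature-scaled softmax, label smoothing, and rejecting suspected OOD samples before pseudo-labeling) come from the text.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a temperature above 1 flattens the output."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_epsilon(mean_confidence, accuracy, max_eps=0.3):
    """Hypothetical rule: smooth more when confidence exceeds accuracy."""
    return min(max(mean_confidence - accuracy, 0.0), max_eps)


def smoothed_target(label, num_classes, epsilon):
    """Standard label smoothing: (1 - eps) * one_hot + eps / num_classes."""
    target = [epsilon / num_classes] * num_classes
    target[label] += 1.0 - epsilon
    return target


def select_pseudo_label(logits, ood_score, temperature=1.0,
                        conf_threshold=0.95, ood_threshold=0.5):
    """Return a class index, or None if the sample should not be pseudo-labeled.

    ood_score is assumed to be an OOD detector output in [0, 1], where a
    higher value means the sample is more likely out-of-distribution.
    """
    if ood_score >= ood_threshold:
        return None  # suspected OOD sample: reject it
    probs = softmax(logits, temperature)
    confidence = max(probs)
    if confidence < conf_threshold:
        return None  # calibrated confidence is too low to trust
    return probs.index(confidence)


if __name__ == "__main__":
    logits = [8.0, 0.0, 0.0]
    print(select_pseudo_label(logits, ood_score=0.1))                   # accepted
    print(select_pseudo_label(logits, ood_score=0.1, temperature=4.0))  # rejected
    print(select_pseudo_label(logits, ood_score=0.9))                   # rejected as OOD
    print(smoothed_target(1, 4, adaptive_epsilon(0.9, 0.8)))
```

In this sketch, raising the temperature lowers the peak probability, so the same logits that pass the confidence threshold at temperature 1 are withheld at temperature 4. That is the sense in which calibration makes pseudo-label selection more conservative for an overconfident model.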