Dynamical Label Augmentation and Calibration for Noisy Electronic Health Records

📅 2025-05-12

📈 Citations: 0

✨ Influential: 0

career value

164K/year

🤖 AI Summary

To address pervasive label noise in electronic health record (EHR) time-series prediction, this paper proposes ACTLL—an attention-driven framework for dynamic label calibration and augmentation. Methodologically, ACTLL introduces two key innovations: (1) a novel instance-wise certainty partitioning mechanism based on a two-component Beta mixture model, jointly optimizing dynamic label calibration and generative semantic augmentation; and (2) the first explicit decoupling of local noise modeling from global temporal representation learning in medical time-series modeling. By performing confidence-aware correction of uncertain samples and reinforcing semantic fidelity of high-confidence ones, ACTLL significantly enhances model robustness. Evaluated across multiple benchmarks—including eICU, MIMIC-IV-ED, and UCR/UEA—ACTLL achieves state-of-the-art performance, improving AUC by 5.2% under 40% label noise and outperforming existing robust learning approaches.

Technology Category

Application Category

📝 Abstract

Medical research, particularly in predicting patient outcomes, heavily relies on medical time series data extracted from Electronic Health Records (EHR), which provide extensive information on patient histories. Despite rigorous examination, labeling errors are inevitable and can significantly impede accurate predictions of patient outcome. To address this challenge, we propose an extbf{A}ttention-based Learning Framework with Dynamic extbf{C}alibration and Augmentation for extbf{T}ime series Noisy extbf{L}abel extbf{L}earning (ACTLL). This framework leverages a two-component Beta mixture model to identify the certain and uncertain sets of instances based on the fitness distribution of each class, and it captures global temporal dynamics while dynamically calibrating labels from the uncertain set or augmenting confident instances from the certain set. Experimental results on large-scale EHR datasets eICU and MIMIC-IV-ED, and several benchmark datasets from the UCR and UEA repositories, demonstrate that our model ACTLL has achieved state-of-the-art performance, especially under high noise levels.

Problem

Research questions and friction points this paper is trying to address.

Addresses labeling errors in EHR for patient outcome prediction

Proposes dynamic calibration and augmentation for noisy time series labels

Improves prediction accuracy under high noise levels in medical data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Attention-based learning framework for noisy labels

Dynamic calibration and label augmentation

Beta mixture model for instance classification

🔎 Similar Papers

No similar papers found.