Closing Gaps: An Imputation Analysis of ICU Vital Signs

📅 2025-10-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Missing physiological measurements (e.g., heart rate) in ICU time-series data can severely degrade the performance of clinical prediction models, yet existing imputation methods have not been compared systematically for this setting. Method: We establish an extensible, reusable benchmark framework that compares 15 time-series imputation methods (including mean imputation, linear interpolation, last-observation-carried-forward (LOCF), Kalman filtering, and deep learning models) alongside 4 amputation methods, which deliberately remove observed values to simulate missingness, across major ICU datasets. These controlled missingness simulations make it possible to measure how accurately each method reconstructs the removed values. Contribution/Results: Our empirical analysis indicates that a well-chosen imputation method improves model accuracy, whereas common ad-hoc heuristics (e.g., zero imputation) introduce bias and degrade performance. The study offers an evidence-based guide for selecting imputation strategies, supporting more standardized and reliable preprocessing of temporal healthcare data for machine learning.
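To make the simpler baselines named in the summary concrete, here is a minimal NumPy sketch of zero imputation, mean imputation, LOCF, and linear interpolation on a short heart-rate series. It is an illustration only, not the benchmark's code: the function names and the sample values are assumptions chosen for this example.

```python
import numpy as np

def zero_impute(x):
    """Replace every missing value (NaN) with 0."""
    return np.where(np.isnan(x), 0.0, x)

def mean_impute(x):
    """Replace every missing value with the mean of the observed values."""
    return np.where(np.isnan(x), np.nanmean(x), x)

def locf_impute(x):
    """Last observation carried forward; leading gaps stay NaN."""
    observed = ~np.isnan(x)
    # For each position, the index of the most recent observed sample (-1 if none).
    idx = np.where(observed, np.arange(len(x)), -1)
    idx = np.maximum.accumulate(idx)
    return np.where(idx >= 0, x[np.maximum(idx, 0)], np.nan)

def linear_impute(x):
    """Linear interpolation between observed neighbours.

    np.interp holds the nearest observed value for leading or trailing gaps.
    """
    observed = ~np.isnan(x)
    t = np.arange(len(x))
    return np.interp(t, t[observed], x[observed])

# Hypothetical heart-rate samples (beats per minute) with a two-sample gap.
hr = np.array([80.0, np.nan, np.nan, 86.0, 90.0])

print(zero_impute(hr))    # gap filled with 0, far from any plausible heart rate
print(locf_impute(hr))    # gap filled with 80, the last observed value
print(linear_impute(hr))  # gap filled with 82 and 84
```

Zero imputation places physiologically impossible values into the series, which is one way to see why the summary singles it out as a source of bias.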

📝 Abstract

As more Intensive Care Unit (ICU) data becomes available, interest in developing clinical prediction models to improve healthcare protocols is growing. However, poor data quality still hinders clinical prediction using Machine Learning (ML). Many vital sign measurements, such as heart rate, contain sizeable missing segments, leaving gaps in the data that could negatively impact prediction performance. Previous works have introduced numerous time-series imputation techniques. Nevertheless, more comprehensive work is needed to compare a representative set of methods for imputing ICU vital signs and to determine best practice. In practice, ad-hoc imputation techniques that could decrease prediction accuracy, such as zero imputation, are still used. In this work, we compare established imputation techniques to guide researchers in improving the performance of clinical prediction models by selecting the most accurate imputation technique. We introduce an extensible and reusable benchmark, currently comprising 15 imputation and 4 amputation methods, designed for benchmarking on major ICU datasets. We hope to provide a comparative basis and facilitate further ML development to bring more models into clinical practice.
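The evaluation idea behind pairing imputation with amputation can be sketched as follows: remove a segment from a fully observed series, impute it, and score the result only on the removed positions. This is a hedged illustration under simple assumptions, not the paper's implementation. The helper names `ampute_segment` and `mae_on_gap`, the single contiguous gap, the sample values, and the choice of mean absolute error are all hypothetical.

```python
import numpy as np

def ampute_segment(x, start, length):
    """Return a copy of x with x[start:start+length] set to NaN, plus the gap mask."""
    mask = np.zeros(len(x), dtype=bool)
    mask[start:start + length] = True
    amputed = x.astype(float).copy()
    amputed[mask] = np.nan
    return amputed, mask

def mae_on_gap(truth, imputed, mask):
    """Mean absolute error, measured only where values were removed."""
    return float(np.mean(np.abs(truth[mask] - imputed[mask])))

# Hypothetical fully observed heart-rate series (beats per minute).
truth = np.array([70.0, 72.0, 75.0, 74.0, 78.0, 80.0])
amputed, mask = ampute_segment(truth, start=2, length=2)

# Two candidate imputations of the simulated gap.
zero_filled = np.where(np.isnan(amputed), 0.0, amputed)
t = np.arange(len(truth))
linear_filled = np.interp(t, t[~mask], amputed[~mask])

print(mae_on_gap(truth, zero_filled, mask))    # 74.5
print(mae_on_gap(truth, linear_filled, mask))  # 1.5
```

Because the ground truth is known at the amputed positions, any imputation method can be scored the same way, which is what makes a shared benchmark possible.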

Problem

Research questions and friction points this paper is trying to address.

Comparing imputation methods for ICU vital signs data gaps

Addressing missing data segments in clinical prediction models

Establishing best practices for ICU time-series imputation techniques

Innovation

Methods, ideas, or system contributions that make the work stand out.

Comparing multiple time-series imputation techniques for ICU data

Creating an extensible benchmark with 15 imputation and 4 amputation methods

Establishing best practices for clinical prediction model performance

🔎 Similar Papers

Learnable Prompt as Pseudo-Imputation: Rethinking the Necessity of Traditional EHR Data Imputation in Downstream Clinical Prediction