Integrative Analysis and Imputation of Multiple Data Streams via Deep Gaussian Processes

📅 2025-05-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Intensive care data present three key challenges: (1) multimodal physiological signals are inherently interdependent yet are commonly modeled in isolation; (2) observations are irregularly sampled, and the sampling times themselves can carry clinical meaning; and (3) missing values are pervasive, while existing imputation methods often neglect temporal structure and uncertainty quantification. To address these, we propose a unified framework that applies deep Gaussian process (DGP) emulation with stochastic imputation. The approach jointly exploits longitudinal dynamics and cross-sectional, cross-modal dependencies to impute asynchronous, multivariate time series while producing calibrated uncertainty estimates. Evaluated on a real-world clinical dataset, the method outperforms conventional baselines, including multiple imputation by chained equations (MICE), last-observation-carried-forward, and individually fitted Gaussian processes, in both imputation accuracy and uncertainty quantification.

📝 Abstract
Healthcare data, particularly in critical care settings, presents three key challenges for analysis. First, physiological measurements come from different sources but are inherently related; yet, traditional methods often treat each measurement type independently, losing valuable information about their relationships. Second, clinical measurements are collected at irregular intervals, and these sampling times can carry clinical meaning. Third, missing values are pervasive. Whilst several imputation methods exist to tackle this common problem, they often fail to address the temporal nature of the data or to provide estimates of uncertainty in their predictions. We propose using deep Gaussian process emulation with stochastic imputation, a methodology originally conceived for computationally expensive models and uncertainty quantification, to handle the missing values that naturally occur in critical care data. This method leverages both longitudinal and cross-sectional information and provides uncertainty estimates for the imputed values. Our evaluation on a clinical dataset shows that the proposed method performs better than conventional methods such as multiple imputation by chained equations (MICE), last-known-value imputation, and individually fitted Gaussian processes (GPs).
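To make the comparison concrete, one of the baselines the abstract mentions, an individually fitted GP per signal, can be sketched in a few lines. This is an illustrative sketch only (the signal values, time points, and kernel choice are assumptions, not the paper's setup); it shows how a single GP handles irregular sampling and returns an uncertainty estimate alongside each imputed value:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Irregularly sampled vital-sign-like series (times in hours, hypothetical values)
t_obs = np.array([0.0, 0.7, 1.1, 2.5, 4.0, 6.3])[:, None]
y_obs = np.array([82.0, 85.0, 84.0, 90.0, 88.0, 86.0])

# One GP per signal: a smooth trend (RBF) plus observation noise (WhiteKernel)
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1.0)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(t_obs, y_obs)

# Impute at the missing time points; return_std gives a per-point uncertainty
t_missing = np.array([1.8, 3.2, 5.0])[:, None]
mean, std = gp.predict(t_missing, return_std=True)
```

Unlike last-known-value imputation, the GP interpolates using all observations and widens its predictive standard deviation in sparsely sampled regions; what it cannot do on its own is borrow strength across related signals, which is the gap the deep GP framework targets.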
Problem

Research questions and friction points this paper is trying to address.

Handling missing values in irregularly sampled clinical data
Capturing relationships between diverse physiological measurements
Providing uncertainty estimates for imputed healthcare data values
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep Gaussian processes for data imputation
Handles irregular and missing clinical measurements
Provides uncertainty estimates for imputed values
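The stochastic-imputation idea behind the contribution above can be illustrated with a minimal sketch: rather than committing to a single point estimate, draw several samples from the GP posterior at each gap, so that downstream analyses can propagate imputation uncertainty. This uses a plain single-layer GP as a stand-in (the authors use deep GPs; the data here are synthetic assumptions):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.RandomState(0)
t_obs = np.linspace(0.0, 8.0, 9)[:, None]
y_obs = np.sin(t_obs).ravel() + 0.1 * rng.randn(9)  # noisy synthetic signal

gp = GaussianProcessRegressor(
    kernel=RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1),
    normalize_y=True,
).fit(t_obs, y_obs)

# Stochastic imputation: draw posterior samples at the gaps instead of
# filling in one deterministic value per missing observation.
t_gap = np.array([2.3, 5.7])[:, None]
draws = gp.sample_y(t_gap, n_samples=20, random_state=1)  # shape (2, 20)

imputed_mean = draws.mean(axis=1)
imputed_spread = draws.std(axis=1)  # nonzero spread encodes imputation uncertainty
```

Each of the 20 draws is a plausible completed dataset, so the spread across draws plays the same role as the between-imputation variance in multiple-imputation methods like MICE, while remaining consistent with the temporal structure of the signal.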