MIEO: encoding clinical data to enhance cardiovascular event prediction

📅 2025-10-13

📈 Citations: 0

✨ Influential: 0

career value

178K/year

🤖 AI Summary

Clinical time-series data suffer from severe missingness due to label scarcity and patient heterogeneity. To address this, we propose MIEO, a missingness-aware self-supervised representation learning framework. MIEO employs a novel missingness-aware autoencoder to learn robust latent patient representations from large-scale unlabeled electronic health records (EHRs), effectively mitigating data sparsity and missing-value interference. These representations are integrated with a lightweight classifier for cardiovascular mortality risk prediction. Evaluated on a real-world ischemic heart disease dataset, MIEO achieves a +4.2% improvement in balanced accuracy over state-of-the-art semi-supervised and imputation-based methods. To our knowledge, this is the first work to incorporate missingness-aware self-supervised encoding into clinical time-series representation learning. It demonstrates strong efficacy and generalizability under low-label-resource settings, offering a promising direction for robust predictive modeling in real-world EHR analytics.

Technology Category

Application Category

📝 Abstract

As clinical data are becoming increasingly available, machine learning methods have been employed to extract knowledge from them and predict clinical events. While promising, approaches suffer from at least two main issues: low availability of labelled data and data heterogeneity leading to missing values. This work proposes the use of self-supervised auto-encoders to efficiently address these challenges. We apply our methodology to a clinical dataset from patients with ischaemic heart disease. Patient data is embedded in a latent space, built using unlabelled data, which is then used to train a neural network classifier to predict cardiovascular death. Results show improved balanced accuracy compared to applying the classifier directly to the raw data, demonstrating that this solution is promising, especially in conditions where availability of unlabelled data could increase.

Problem

Research questions and friction points this paper is trying to address.

Predicting cardiovascular death using clinical data

Addressing low availability of labeled clinical data

Handling data heterogeneity and missing values

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised auto-encoders address data heterogeneity

Latent space embedding built using unlabeled clinical data

Neural network classifier predicts cardiovascular death events

🔎 Similar Papers

Fusing Echocardiography Images and Medical Records for Continuous Patient Stratification