MIEO: encoding clinical data to enhance cardiovascular event prediction

📅 2025-10-13
🤖 AI Summary
Machine learning on clinical data faces two recurring obstacles: labelled data are scarce, and patient heterogeneity leaves many values missing. To address both, this work proposes MIEO, a missingness-aware self-supervised representation learning framework. MIEO uses an autoencoder to learn latent patient representations from unlabelled clinical records, which mitigates data sparsity and missing-value interference. These representations are then passed to a neural network classifier that predicts cardiovascular death. Evaluated on a real-world ischaemic heart disease dataset, MIEO is reported to achieve a +4.2% improvement in balanced accuracy; the abstract describes the baseline as the same classifier applied directly to the raw data. The approach looks promising for low-label settings, particularly where more unlabelled data may become available.

📝 Abstract
As clinical data are becoming increasingly available, machine learning methods have been employed to extract knowledge from them and predict clinical events. While promising, these approaches suffer from at least two main issues: low availability of labelled data, and data heterogeneity leading to missing values. This work proposes the use of self-supervised auto-encoders to efficiently address these challenges. We apply our methodology to a clinical dataset from patients with ischaemic heart disease. Patient data are embedded in a latent space, built using unlabelled data, which is then used to train a neural network classifier to predict cardiovascular death. Results show improved balanced accuracy compared to applying the classifier directly to the raw data, demonstrating that this solution is promising, especially in conditions where availability of unlabelled data could increase.
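
The abstract reports results as balanced accuracy, which is the mean of the per-class recalls and is commonly preferred when one outcome (here, cardiovascular death) is much rarer than the other. The sketch below shows the computation in plain Python; the labels are hypothetical illustration data, not values from the paper.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    hits = defaultdict(int)    # correct predictions per true class
    totals = defaultdict(int)  # number of samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)

# Hypothetical labels: 1 = cardiovascular death, 0 = no event (imbalanced on purpose).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)
print(plain_accuracy, score)
```

In this toy case plain accuracy is 0.9 while balanced accuracy is 0.75, because half of the rare positive cases are missed.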

Problem

Research questions and friction points this paper is trying to address.

Predicting cardiovascular death using clinical data
Addressing low availability of labeled clinical data
Handling data heterogeneity and missing values
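
One common way to train an auto-encoder on records with missing values is to score the reconstruction only on the entries that were actually observed. The sketch below illustrates that idea; it is an assumption about how missingness could be handled, not the loss used in the paper, and `masked_mse` and the sample rows are hypothetical.

```python
def masked_mse(original, reconstruction):
    """Mean squared error over observed entries only; None marks a missing value."""
    total, count = 0.0, 0
    for row, row_hat in zip(original, reconstruction):
        for x, x_hat in zip(row, row_hat):
            if x is None:  # missing in the record: contributes no error
                continue
            total += (x - x_hat) ** 2
            count += 1
    return total / count if count else 0.0

# Hypothetical standardized features for two patients, with gaps.
patients = [[0.5, None, 1.0], [None, 0.2, 0.0]]
# Hypothetical decoder output; values at missing positions are ignored.
decoded = [[0.4, 9.9, 1.0], [7.7, 0.2, 0.2]]

loss = masked_mse(patients, decoded)
print(loss)
```

Because masked positions are skipped, whatever the decoder outputs for a missing field (9.9 and 7.7 above) does not affect the loss.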

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised auto-encoders address data heterogeneity
Latent space embedding built using unlabeled clinical data
Neural network classifier predicts cardiovascular death events
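
The two-stage idea listed above (learn an embedding from unlabelled records, then train a classifier on the embedded labelled records) can be sketched in a few lines. This is a deliberately simplified stand-in, not the authors' architecture: it uses a tied-weight linear auto-encoder with a one-dimensional latent space, and a nearest-centroid rule in place of the neural network classifier. All function names and data are hypothetical.

```python
def encode(row, w):
    """1-D latent code; missing inputs (None) are skipped, i.e. treated as zero."""
    return sum(x * wj for x, wj in zip(row, w) if x is not None)

def train_autoencoder(rows, n_features, lr=0.02, epochs=200):
    """Tied-weight linear auto-encoder fitted by SGD on a masked squared error."""
    w = [0.5] * n_features
    for _ in range(epochs):
        for row in rows:
            z = encode(row, w)
            # Reconstruction error on observed features only.
            err = [z * wj - x if x is not None else 0.0 for x, wj in zip(row, w)]
            back = sum(e * wj for e, wj in zip(err, w))  # error flowing back through z
            for j, x in enumerate(row):
                if x is not None:
                    w[j] -= lr * 2.0 * (err[j] * z + back * x)
    return w

def fit_centroids(labelled, w):
    """Stand-in classifier: mean latent code per class."""
    sums, counts = {}, {}
    for row, label in labelled:
        sums[label] = sums.get(label, 0.0) + encode(row, w)
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(row, w, centroids):
    z = encode(row, w)
    return min(centroids, key=lambda label: abs(z - centroids[label]))

# Stage 1: unlabelled records (hypothetical, standardized, with gaps).
unlabelled = [
    [1.0, 0.9, None], [0.8, None, 1.1], [-1.0, -1.1, -0.9],
    [None, -0.9, -1.0], [1.1, 1.0, 0.9], [-0.9, None, -1.2],
]
w = train_autoencoder(unlabelled, n_features=3)

# Stage 2: a small labelled set (1 = event, 0 = no event; hypothetical).
labelled = [
    ([0.9, 1.0, None], 1), ([-1.0, None, -0.8], 0),
    ([1.2, None, 1.0], 1), ([None, -1.1, -1.0], 0),
]
centroids = fit_centroids(labelled, w)

pred_a = predict([None, 0.8, 0.9], w, centroids)
pred_b = predict([-0.7, -0.9, None], w, centroids)
print(pred_a, pred_b)
```

The point of the design is that stage 1 never sees a label, so it can use every available record, while stage 2 needs only the few labelled ones.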

Authors

Davide Borghini, Department of Computer Science, University of Pisa, Pisa, Italy
Davide Marchi, Department of Computer Science, University of Pisa, Pisa, Italy
Angelo Nardone, Department of Computer Science, University of Pisa, Pisa, Italy
Giordano Scerra, Department of Computer Science, University of Pisa, Pisa, Italy
Silvia Giulia Galfrè, Department of Computer Science, University of Pisa, Pisa, Italy
Alessandro Pingitore, Clinical Physiology Institute, CNR, Pisa, Italy
Giuseppe Prencipe, Dipartimento di Informatica, Università di Pisa (Digital Health, Distributed Computing, Autonomous Mobile Robots)
Corrado Priami, Department of Computer Science, University of Pisa, Pisa, Italy
Alina Sîrbu, Computer Science Department, University of Pisa, Italy (computational biology, complex systems, machine learning, social network analysis, migration)