🤖 AI Summary
The COVID-19 pandemic severely disrupted routine monitoring of key cardiovascular disease (CVD) biomarkers, including LDL-C, HbA1c, BMI, and systolic blood pressure, which complicates chronic disease management. Existing work lacks multi-biomarker joint prediction, temporal dynamic modeling, and uncertainty quantification. To address these gaps, we propose MBT-CB, a Multi-target Bayesian Transformer framework for electronic health record (EHR) data. The method combines a pre-trained BERT architecture, variational Bayesian inference, temporal embeddings, and the DeepMTR structure, and uses attention mechanisms to capture inter-biomarker dependencies and longitudinal correlations. Evaluated on 3,390 patient records from 304 unique patients, it achieves MAE = 0.00887 and RMSE = 0.0135, outperforming the reported baseline models. It outputs predictive means together with two uncertainty estimates (data and model), which supports interpretable, robust remote CVD management and clinical decision-making.
📝 Abstract
The COVID-19 pandemic disrupted healthcare systems worldwide, disproportionately impacting individuals with chronic conditions such as cardiovascular disease (CVD). These disruptions, through delayed care and behavioral changes, affected key CVD biomarkers, including LDL cholesterol (LDL-C), HbA1c, BMI, and systolic blood pressure (SysBP). Accurate modeling of these changes is crucial for predicting disease progression and guiding preventive care. However, prior work has not addressed multi-target prediction of CVD biomarkers from Electronic Health Records (EHRs) using machine learning (ML) while jointly capturing biomarker interdependencies, temporal patterns, and predictive uncertainty. In this paper, we propose MBT-CB, a Multi-target Bayesian Transformer (MBT) framework built on a pre-trained BERT-based transformer, to jointly predict the CVD biomarkers LDL-C, HbA1c, BMI, and SysBP from EHR data. The model uses Bayesian Variational Inference to estimate uncertainties, embeddings to capture temporal relationships, and a DeepMTR model to capture biomarker inter-relationships. We evaluate MBT-CB on retrospective EHR data from 3,390 CVD patient records (304 unique patients) in Central Massachusetts during the COVID-19 pandemic. MBT-CB outperformed a comprehensive set of baselines, including other BERT-based ML models, achieving an MAE of 0.00887, an RMSE of 0.0135, and an MSE of 0.00027, while capturing data and model uncertainty, patient biomarker inter-relationships, and temporal dynamics via its attention and embedding mechanisms. These results highlight MBT-CB's potential to improve CVD biomarker prediction and support clinical decision-making during pandemics.
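The abstract does not spell out how data and model uncertainty are separated. One common approach for Bayesian neural networks, shown here only as a minimal sketch and not as the authors' implementation, is to draw several weight samples from the variational posterior, record the predicted mean and noise variance for each target, and apply the law of total variance. The function name `decompose_uncertainty`, the input layout, and the numbers in the usage example are all hypothetical.

```python
from statistics import fmean, pvariance

# Target order follows the abstract; the values used below are made up.
BIOMARKERS = ("LDL-C", "HbA1c", "BMI", "SysBP")


def decompose_uncertainty(sampled_means, sampled_variances):
    """Split predictive variance into data and model parts, per target.

    sampled_means[s][t]     : predicted mean for target t under weight sample s
    sampled_variances[s][t] : predicted noise variance for target t under sample s

    Returns three lists (one entry per target):
      predictive mean, data (aleatoric) variance, model (epistemic) variance.
    """
    n_targets = len(sampled_means[0])
    pred_mean, data_var, model_var = [], [], []
    for t in range(n_targets):
        means_t = [sample[t] for sample in sampled_means]
        vars_t = [sample[t] for sample in sampled_variances]
        pred_mean.append(fmean(means_t))
        # Data uncertainty: average of the noise variances the model predicts.
        data_var.append(fmean(vars_t))
        # Model uncertainty: spread of the predicted means across weight samples.
        model_var.append(pvariance(means_t))
    return pred_mean, data_var, model_var


if __name__ == "__main__":
    # Three hypothetical posterior samples for one patient visit (normalized units).
    means = [
        [0.42, 0.31, 0.55, 0.60],
        [0.44, 0.30, 0.54, 0.63],
        [0.40, 0.32, 0.56, 0.57],
    ]
    variances = [
        [0.010, 0.004, 0.002, 0.020],
        [0.012, 0.004, 0.002, 0.018],
        [0.011, 0.005, 0.002, 0.022],
    ]
    mean, data_u, model_u = decompose_uncertainty(means, variances)
    for name, m, d, e in zip(BIOMARKERS, mean, data_u, model_u):
        print(f"{name}: mean={m:.3f} data_var={d:.4f} model_var={e:.5f}")
```

Under this decomposition the total predictive variance for a target is the sum of the two terms, so a clinician-facing tool could report whether an uncertain prediction stems from noisy measurements or from limited training data.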