🤖 AI Summary
Modeling clinical data from intensive care units (ICUs) remains a fundamental challenge: the data are heterogeneous, asynchronous, and high-dimensional, and they mix structured and unstructured records. To address this, we propose ICU-BERT, a Transformer-based pretraining framework designed for critical care settings. The method combines (1) multi-task self-supervised pretraining with a multi-token input strategy that accommodates asynchronous, irregularly timed measurements, and (2) dense semantic embeddings from a biomedical large language model, so that representations are learned with minimal preprocessing and without manual feature engineering. Pretrained on MIMIC-IV and then fine-tuned, ICU-BERT matches or surpasses current performance benchmarks in an initial evaluation covering five downstream clinical tasks, including mortality risk prediction and ICD coding, and four additional ICU datasets. The results suggest improved generalizability and point to ICU-BERT as an adaptable basis for AI-driven clinical decision support.
📝 Abstract
The multivariate, asynchronous nature of real-world clinical data, such as that generated in Intensive Care Units (ICUs), challenges traditional AI-based decision-support systems. Such systems often assume data regularity and feature independence, and frequently rely on limited data scopes and manual feature engineering. The potential of generative AI technologies for analyzing clinical data has not yet been fully exploited. We introduce ICU-BERT, a transformer-based model pre-trained on the MIMIC-IV database using a multi-task scheme to learn robust representations of complex ICU data with minimal preprocessing. ICU-BERT employs a multi-token input strategy that incorporates dense embeddings from a biomedical Large Language Model to learn a generalizable representation of complex, multivariate ICU data. In an initial evaluation on five tasks and four additional ICU datasets, fine-tuned ICU-BERT either matches or surpasses current performance benchmarks. By integrating structured and unstructured data, ICU-BERT advances the use of foundation models in medical informatics, offering an adaptable solution for clinical decision support across diverse applications.
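The abstract does not spell out how the multi-token input is composed, so the following is only a minimal sketch of one plausible reading: each ICU measurement is kept at its own timestamp (no resampling onto a regular grid) and is represented by several components, namely a dense feature embedding, the measured value, and the time since admission. The names `Event`, `FEATURE_EMBEDDINGS`, and `encode_stay`, the feature names, and the toy vectors are all hypothetical and are not taken from ICU-BERT.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    hours: float   # time since ICU admission; irregular spacing is preserved
    feature: str   # hypothetical feature name, e.g. "heart_rate"
    value: float   # raw measured value


# Stand-in for dense embeddings from a biomedical language model.
# These toy 2-d vectors are invented for illustration only.
FEATURE_EMBEDDINGS: Dict[str, List[float]] = {
    "heart_rate": [0.1, 0.3],
    "lactate": [0.7, -0.2],
}

# One input position = (feature embedding, value, timestamp).
Token = Tuple[List[float], float, float]


def encode_stay(events: List[Event], max_len: int) -> List[Token]:
    """Order events by time and keep the most recent `max_len` of them.

    Nothing is imputed or resampled: each event stays at its own
    timestamp, which is what makes the input asynchronous.
    """
    ordered = sorted(events, key=lambda e: e.hours)
    recent = ordered[max(0, len(ordered) - max_len):]
    return [(FEATURE_EMBEDDINGS[e.feature], e.value, e.hours) for e in recent]


# Measurements arrive out of order and at uneven intervals.
stay = [
    Event(5.25, "lactate", 2.1),
    Event(0.5, "heart_rate", 88.0),
    Event(0.75, "heart_rate", 92.0),
]
tokens = encode_stay(stay, max_len=2)
```

In a real model the three components would typically be projected and combined into a single vector per position before entering the transformer; how ICU-BERT does this is not stated here, so the sketch stops at the tuple representation.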