🤖 AI Summary
Existing ICU-based sepsis studies suffer from outdated data, non-reproducible preprocessing, and insufficient coverage of therapeutic interventions. To address these limitations, this work constructs a standardized sepsis cohort (n=35,239) from MIMIC-IV that adheres to the Sepsis-3 definition and integrates time-aligned clinical variables with multidimensional treatment data, including vasopressors, fluid administration, mechanical ventilation, and antibiotics. It proposes a transparent, reproducible preprocessing pipeline featuring structured missing-value imputation and establishes three benchmark tasks: early mortality prediction, length-of-stay estimation, and shock onset classification. Experimental results show that incorporating treatment variables substantially improves model performance, particularly for Transformer-based architectures. The study provides an open-source, reproducible benchmark platform for sequential modeling in critical care, enabling standardized, comparable sepsis prediction research.
📝 Abstract
Sepsis is a leading cause of mortality in intensive care units (ICUs), yet existing research often relies on outdated datasets, non-reproducible preprocessing pipelines, and limited coverage of clinical interventions. We introduce MIMIC-Sepsis, a curated cohort and benchmark framework derived from the MIMIC-IV database, designed to support reproducible modeling of sepsis trajectories. Our cohort includes 35,239 ICU patients with time-aligned clinical variables and standardized treatment data, including vasopressors, fluids, mechanical ventilation, and antibiotics. We describe a transparent preprocessing pipeline (based on Sepsis-3 criteria, structured imputation strategies, and treatment inclusion) and release it alongside benchmark tasks focused on early mortality prediction, length-of-stay estimation, and shock onset classification. Empirical results demonstrate that incorporating treatment variables substantially improves model performance, particularly for Transformer-based architectures. MIMIC-Sepsis serves as a robust platform for evaluating predictive and sequential models in critical care research.