🤖 AI Summary
This study addresses the challenge of predicting prolonged hospital stays (>7 days) among older adults in resource-constrained settings of low- and middle-income countries, with the goal of reducing adverse in-hospital events. Leveraging patient and hospital administrative data available at admission, the authors propose a novel feature selection method that integrates information value analysis with graph-theoretic clique structures to identify nine non-redundant, highly interpretable variables. An interpretable logistic regression model built on these features achieves an AUC-ROC of 0.82, accuracy of 0.76, specificity of 0.83, and sensitivity of 0.64 on the validation set. The approach maintains strong predictive performance while significantly enhancing clinical transparency and practical utility for deployment in low-resource healthcare environments.
📝 Abstract
Prolonged length of stay (pLoS) is a significant factor associated with the risk of adverse in-hospital events. We develop and explain a predictive model for pLoS using admission-level patient and hospital administrative data. The approach includes a feature selection method that retains non-correlated features with the highest information value. The method uses the features' weights of evidence to select a representative from each clique, a structure from graph theory. This prognostic study analyzed the records of 120,354 hospital admissions at the Hospital Alma Mater de Antioquia between January 2017 and March 2022. After a cleaning process, the dataset was split into training (67%), test (22%), and validation (11%) cohorts. A logistic regression model was trained to predict pLoS as a binary outcome: a stay of more than 7 days versus 7 days or fewer. The performance of the model was evaluated using accuracy, precision, sensitivity, specificity, and AUC-ROC metrics. The feature selection method returns nine interpretable variables, enhancing the model's transparency. In the validation cohort, the pLoS model achieved a specificity of 0.83 (95% CI, 0.82-0.84), sensitivity of 0.64 (95% CI, 0.62-0.65), accuracy of 0.76 (95% CI, 0.76-0.77), precision of 0.67 (95% CI, 0.66-0.69), and AUC-ROC of 0.82 (95% CI, 0.81-0.83). The model exhibits strong predictive performance and offers insights into the factors that influence prolonged hospital stays. This makes it a valuable tool for hospital management and for developing future intervention studies aimed at reducing pLoS.
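The feature selection idea described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: features are treated as categorical, weight of evidence (WoE) is taken as ln(share of events / share of non-events) with additive smoothing `eps`, two features are linked in the graph when the absolute Pearson correlation of their WoE encodings reaches a hypothetical `threshold` of 0.7, and the representative of each maximal clique is its highest-information-value (IV) member. The column names and toy data are invented for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations


def information_value(feature, target, eps=0.5):
    """Return (woe_by_category, iv) for a categorical feature and a 0/1 target."""
    events = defaultdict(float)      # target == 1 (prolonged stay)
    non_events = defaultdict(float)  # target == 0
    for x, y in zip(feature, target):
        (events if y == 1 else non_events)[x] += 1
    categories = set(events) | set(non_events)
    # eps smoothing (an assumption) avoids log(0) for one-class categories
    total_e = sum(events.values()) + eps * len(categories)
    total_n = sum(non_events.values()) + eps * len(categories)
    woe, iv = {}, 0.0
    for c in categories:
        p_e = (events[c] + eps) / total_e
        p_n = (non_events[c] + eps) / total_n
        woe[c] = math.log(p_e / p_n)
        iv += (p_e - p_n) * woe[c]
    return woe, iv


def pearson(a, b):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def maximal_cliques(adj):
    """Bron-Kerbosch enumeration; adj maps node -> set of neighbours."""
    def expand(r, p, x):
        if not p and not x:
            yield r
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def select_features(columns, target, threshold=0.7):
    """Keep the highest-IV feature of each maximal clique of correlated features."""
    woe, iv = {}, {}
    for name, values in columns.items():
        woe[name], iv[name] = information_value(values, target)
    # WoE-encode every feature so correlations share one numeric scale
    encoded = {n: [woe[n][v] for v in vals] for n, vals in columns.items()}
    adj = {n: set() for n in columns}
    for a, b in combinations(columns, 2):
        if abs(pearson(encoded[a], encoded[b])) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    # An uncorrelated feature forms a clique of size 1 and represents itself
    chosen = {max(clique, key=lambda n: iv[n]) for clique in maximal_cliques(adj)}
    return sorted(chosen), iv


# Toy data (hypothetical): 1 = stay of more than 7 days
target = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "admission_type": ["x", "x", "x", "x", "y", "y", "y", "y"],
    "referral_source": ["p", "p", "p", "q", "q", "q", "q", "q"],
    "sex": ["m", "m", "n", "n", "m", "n", "n", "n"],
}
selected, iv = select_features(columns, target)
print(selected, {k: round(v, 3) for k, v in iv.items()})
```

In this toy example, `admission_type` and `referral_source` are strongly correlated and fall into the same clique, so only the one with the higher IV is kept, while `sex` is retained as its own clique. One caveat of this simplified rule is that representatives of overlapping cliques can still be correlated with each other; how the paper resolves that case is not stated in the abstract.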