An Explainable Disease Surveillance System for Early Prediction of Multiple Chronic Diseases

Date: 2025-01-27
Citations: 0
Influential: 0
AI Summary
Problem: Long-term health risk prediction for chronic disease patients is difficult, and existing models often lack interpretability. Method: We propose an interpretable multi-morbidity monitoring system that uses routine electronic health records (EHRs) to predict 3-, 6-, and 12-month disease exacerbation risks without requiring laboratory tests. Our approach introduces a rule-augmented random forest framework that integrates SHAP-based feature attribution with a clinically validated surrogate model and formalizes decision logic via a structured rule engine; all interpretations were rigorously validated by multidisciplinary clinical experts. Contribution/Results: Evaluated on multicenter real-world EHR data with internal cross-validation, the system achieves high discriminative performance (AUROC >0.85) and robustness (F1 >0.78), significantly outperforming baseline models. It has been deployed in the CureMD EMR platform, enabling real-time clinical risk stratification and evidence-informed intervention decisions.
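To make the two reported metrics concrete, the following is a minimal sketch of how AUROC and F1 can be computed from predicted risk scores. It is an illustration only, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions introduced here.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case is scored
    above a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_at_threshold(y_true: Sequence[int], scores: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1 for the positive class after thresholding the risk scores.
    The 0.5 default is an assumption, not a value taken from the paper."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): 1 = patient later diagnosed, 0 = not diagnosed.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

print(round(auroc(y_true, scores), 3))            # 0.889 (8 of 9 pairs ranked correctly)
print(round(f1_at_threshold(y_true, scores), 3))  # 0.667 (tp=2, fp=1, fn=1)
```

AUROC depends only on how the scores rank patients, while F1 depends on the chosen threshold, which is one reason the two are often reported together.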

Abstract
This study addresses a critical gap in the healthcare system by developing a clinically meaningful, practical, and explainable disease surveillance system for multiple chronic diseases, utilizing routine EHR data from multiple U.S. practices integrated with CureMD's EMR/EHR system. Unlike traditional systems, which use AI models that rely on features from patients' lab results, our approach focuses on routinely available data, such as medical history, vitals, diagnoses, and medications, to assess the risk of chronic diseases up to a year in advance. For each chronic disease, we trained three distinct prediction models that forecast the risk of the disease 3, 6, and 12 months before a potential diagnosis. We developed Random Forest models, which were internally validated using F1 scores and AUROC as performance metrics and further evaluated by a panel of expert physicians for clinical relevance based on inferences grounded in medical knowledge. We also discuss how we integrated these models into a practical EMR system. Beyond using Shapley attributions and surrogate models for explainability, we introduce a new rule-engineering framework to enhance the intrinsic explainability of Random Forests.
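One way to read the three prediction horizons is that each model only sees records dated earlier than a cutoff placed 3, 6, or 12 months before the diagnosis. The sketch below illustrates that reading. It is an assumption about how such windows could be built, not the authors' pipeline; the helper names, the record layout, and the choice to clamp the day of month to 28 are all hypothetical.

```python
from datetime import date
from typing import Dict, List, Tuple

HORIZONS_MONTHS = (3, 6, 12)  # the three horizons named in the abstract

Record = Tuple[date, str]  # hypothetical layout: (record date, record type)


def months_before(d: date, months: int) -> date:
    """Shift a date back by whole calendar months.
    Simplification: the day is clamped to 28 so the result is always valid."""
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, min(d.day, 28))


def feature_cutoffs(diagnosis_date: date) -> Dict[int, date]:
    """For each horizon, the date before which records may be used as features."""
    return {h: months_before(diagnosis_date, h) for h in HORIZONS_MONTHS}


def visible_records(records: List[Record], cutoff: date) -> List[Record]:
    """Keep only records strictly earlier than the cutoff."""
    return [r for r in records if r[0] < cutoff]


# Toy patient history (hypothetical).
history = [
    (date(2023, 8, 1), "vitals"),
    (date(2023, 10, 1), "medication"),
    (date(2024, 1, 10), "diagnosis_code"),
]
cutoffs = feature_cutoffs(date(2024, 3, 15))
for horizon, cutoff in cutoffs.items():
    print(horizon, cutoff, len(visible_records(history, cutoff)))
```

Under this reading, the 12-month model has the least history to work with, so the same patient yields a different feature set for each of the three models.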
Problem

Research questions and friction points this paper is trying to address.

Chronic Disease Prediction
Health Monitoring System
Patient Care Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Clinical Data Prediction
Random Forest Algorithm
Chronic Disease Risk Modeling
Shaheer Ahmad Khan
CureMD Research, 80 Pine St 21st Floor, New York, NY 10005, United States
Muhammad Usamah Shahid
CureMD Research, 80 Pine St 21st Floor, New York, NY 10005, United States
Ahmad Abdullah
CureClinic, 30 Davis Road, Lahore, Pakistan
Ibrahim Hashmat
CureMD Research, 80 Pine St 21st Floor, New York, NY 10005, United States
Muddassar Farooq
Professor & Dean, Air University
AI & ML, Intelligent Healthcare Systems, Computer and Network Security, Bio-inspired Computing, Swarm Intelligence