🤖 AI Summary
Clinical pathway prediction from electronic health records (EHRs) is challenging due to sparse, highly variable, and knowledge-intensive treatment sequences.
Method: We propose a knowledge-driven Next Activity Prediction (NAP) framework that, for the first time, models the hierarchical structure of the ICD-10-CM and ICD-10-PCS medical taxonomies as a graph and computes the semantic similarity between clinical codes via graph matching. This domain knowledge is then integrated into event log sequence modeling.
Contribution/Results: Our approach combines data-driven learning with clinical knowledge guidance and improves prediction accuracy and F1-score on a MIMIC-IV-derived dataset. It also generates interpretable predictions grounded in disease- or procedure-based clinical pathways. This addresses a key limitation of purely data-driven methods in low-resource medical settings and points to a NAP paradigm that pairs predictive performance with clinical interpretability.
📝 Abstract
The rapid progress of modern medicine presents physicians with complex challenges when planning patient treatment. Next-activity prediction (NAP), a technique from the field of Predictive Business Process Monitoring, is a promising way to support physicians in treatment planning by proposing a possible next treatment step. Existing patient data, often in the form of electronic health records, can be analyzed to recommend the next suitable step in the treatment process. However, using patient data poses many challenges because medical data is knowledge-intensive, highly variable, and scarce. To overcome these challenges, this article examines the use of the knowledge encoded in taxonomies to improve and explain the prediction of the next activity in the treatment process. This study proposes the TS4NAP approach, which uses medical taxonomies (ICD-10-CM and ICD-10-PCS) in combination with graph matching to assess the similarity of medical codes and predict the next treatment step. The effectiveness of the proposed approach is evaluated using event logs derived from the MIMIC-IV dataset. The results highlight the potential of domain-specific knowledge held in taxonomies to improve next-activity prediction and, by making the predictions more explainable, to support treatment planning and decision-making.
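
To make the idea of taxonomy-based similarity concrete, the following is a minimal sketch and not the TS4NAP implementation. It assumes a toy hierarchy in which every code's ancestors are its character prefixes (a simplification of the real ICD-10-CM and ICD-10-PCS structure), swaps in a Wu-Palmer-style depth measure in place of the graph matching named in the abstract, and predicts the next step by nearest-neighbor lookup over historical traces. The function names `code_path`, `code_similarity`, `prefix_similarity`, and `predict_next`, and the two-trace log, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def code_path(code: str) -> list[str]:
    """Root-to-code path in a toy prefix hierarchy (assumption, not the real taxonomy)."""
    c = code.replace(".", "")
    # Levels: first character, 3-character category, then one level per extra character.
    cuts = [1, 3] + list(range(4, len(c) + 1))
    return ["ROOT"] + [c[:k] for k in cuts if k <= len(c)]


def code_similarity(a: str, b: str) -> float:
    """Wu-Palmer-style similarity: 2 * depth(common ancestor) / (depth(a) + depth(b))."""
    pa, pb = code_path(a), code_path(b)
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return 2 * common / (len(pa) + len(pb))


def prefix_similarity(p: list[str], q: list[str]) -> float:
    """Mean code similarity over the right-aligned overlap of two non-empty prefixes."""
    n = min(len(p), len(q))
    return sum(code_similarity(x, y) for x, y in zip(p[-n:], q[-n:])) / n


def predict_next(prefix: list[str], log: list[list[str]]) -> tuple[str, float]:
    """Return the next activity of the most similar historical prefix and its score."""
    scores: dict[str, float] = defaultdict(float)
    for trace in log:
        for k in range(1, len(trace)):
            sim = prefix_similarity(prefix, trace[:k])
            scores[trace[k]] = max(scores[trace[k]], sim)
    best = max(scores, key=scores.get)
    return best, scores[best]


# Illustrative two-trace log: a diagnosis code followed by a procedure code.
log = [["I50.1", "02HV33Z"], ["J18.9", "0BH17EZ"]]
next_code, score = predict_next(["I50.9"], log)
print(next_code, score)
```

Because the prediction is tied to a specific historical prefix and a shared taxonomy ancestor (here the category `I50`), the score can be reported alongside the suggestion as a simple form of explanation. A faithful reproduction would need the official taxonomy files rather than prefix truncation, since ICD-10-CM chapters span letter ranges and ICD-10-PCS characters each encode a separate axis.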