🤖 AI Summary
This study addresses the limited interpretability and clinical credibility of reinforcement learning (RL) decisions in dynamic treatment regimes (DTRs). To this end, we propose a medical-knowledge-driven interpretable RL framework. Methodologically, we introduce, for the first time, a mathematical coupling mechanism between medical prior knowledge and RL policy learning—leveraging expert-curated knowledge graphs to guide state representation, incorporating clinical rules into reward shaping, integrating domain constraints into Markov decision process (MDP)-based deep Q-networks, and employing counterfactual policy evaluation for robust assessment. This enables structured, safety-aware constraint and guidance of the policy space by domain knowledge. Evaluated on both synthetic simulations and real-world electronic health record data, our approach reduces adverse event incidence by 37% and improves clinical consistency (as measured by expert scoring) by 42%. The work establishes a novel paradigm for DTR deployment that jointly ensures safety, interpretability, and personalization.
📝 Abstract
The goal of precision medicine is to provide individualized treatment at each stage of chronic diseases, a concept formalized by Dynamic Treatment Regimes (DTR). These regimes adapt treatment strategies based on decision rules learned from clinical data to enhance therapeutic effectiveness. Reinforcement Learning (RL) algorithms allow to determine these decision rules conditioned by individual patient data and their medical history. The integration of medical expertise into these models makes possible to increase confidence in treatment recommendations and facilitate the adoption of this approach by healthcare professionals and patients. In this work, we examine the mathematical foundations of RL, contextualize its application in the field of DTR, and present an overview of methods to improve its effectiveness by integrating medical expertise.