🤖 AI Summary
Evaluating reinforcement learning (RL) policies for optimizing agent-based epidemiological models (ABMs/RABMs) remains challenging due to system complexity, stochasticity, and the absence of domain-informed evaluation metrics aligned with public health objectives.
Method: This paper proposes a domain-driven, integrated evaluation framework that translates key epidemiological goals—such as mask-wearing adherence, vaccination coverage, and lockdown intensity—into quantifiable metrics. These are jointly modeled with conventional RL metrics (e.g., cumulative reward, convergence speed) and augmented with assessments of dynamic responsiveness under resource constraints (e.g., fluctuating mask supply).
Contribution/Results: Experiments across diverse epidemic scenarios demonstrate that the proposed metrics significantly improve policy discriminability and strengthen the alignment between RL-driven decisions and real-world public health outcomes. The framework establishes an interpretable, verifiable, and epidemiologically grounded evaluation paradigm for RL-optimized RABMs.
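The joint evaluation described above could be sketched as follows. This is a minimal illustrative example, not the paper's implementation: all function and field names (`evaluate_policy`, `masked`, `vaccinated`, etc.) are hypothetical, and the paper's actual metrics for lockdown intensity and resource-constrained responsiveness are omitted for brevity.

```python
from statistics import mean

def evaluate_policy(episodes):
    """Aggregate per-episode logs into a combined record of a conventional
    RL metric (cumulative reward) and domain-driven epidemiological metrics.

    Each episode is a dict with (illustrative fields):
      rewards:    list of per-step rewards
      masked:     per-step fraction of agents wearing masks
      vaccinated: final vaccination coverage in [0, 1]
    """
    returns = [sum(ep["rewards"]) for ep in episodes]          # RL metric
    adherence = [mean(ep["masked"]) for ep in episodes]        # domain metric
    coverage = [ep["vaccinated"] for ep in episodes]           # domain metric
    return {
        "mean_return": mean(returns),
        "mask_adherence": mean(adherence),
        "vaccination_coverage": mean(coverage),
    }

episodes = [
    {"rewards": [1.0, 2.0], "masked": [0.5, 0.7], "vaccinated": 0.6},
    {"rewards": [0.5, 1.5], "masked": [0.4, 0.6], "vaccinated": 0.8},
]
print(evaluate_policy(episodes))
# → {'mean_return': 2.5, 'mask_adherence': 0.55, 'vaccination_coverage': 0.7}
```

Reporting domain metrics alongside the return, rather than folding everything into a single scalar, is what makes policies distinguishable when two agents achieve similar rewards through epidemiologically different behavior.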
📝 Abstract
For the development and optimization of agent-based models (ABMs) and rational agent-based models (RABMs), optimization techniques such as reinforcement learning (RL) are widely used. However, assessing the performance of RL-optimized ABMs and RABMs is challenging due to the complexity and stochasticity of the modeled systems and the lack of well-standardized metrics for comparing RL algorithms. In this study, we develop domain-driven metrics for RL that build on state-of-the-art metrics. We demonstrate our "Domain-driven RL metrics" through policy optimization on a rational ABM disease-modeling case study covering masking behavior, vaccination, and lockdown in a pandemic. Our results demonstrate the use of domain-driven rewards in conjunction with traditional and state-of-the-art metrics across several simulation scenarios, such as the differential availability of masks.