🤖 AI Summary
This study addresses scalable health management for Medicaid and safety-net populations, where time cost, clinical risk, and auditability must be balanced across outreach modalities (SMS, phone calls, video visits, in-person encounters). The authors propose TTL+ITD, a lightweight offline reinforcement learning framework that combines test-time learning (TTL) via local neighborhood calibration with inference-time deliberation (ITD) over a small Q-ensemble, explicitly accounting for both predictive uncertainty and time cost. Transparent, tunable parameters (neighborhood size and uncertainty/cost penalties) support subgroup-level impact auditing and explicit efficiency–fairness trade-offs. The method keeps the policy interpretable and the training pipeline traceable. Evaluated on real-world de-identified operational data, TTL+ITD yields stable value estimates and fine-grained subgroup effect assessment, supporting accountable, equitable, and clinically actionable decision-making.
📝 Abstract
Care coordination and population health management programs serve large Medicaid and safety-net populations and must be auditable, efficient, and adaptable. While clinical risk for outreach modalities is typically low, time and opportunity costs differ substantially across text, phone, video, and in-person visits. We propose a lightweight offline reinforcement learning (RL) approach that augments trained policies with (i) test-time learning (TTL) via local neighborhood calibration, and (ii) inference-time deliberation (ITD) via a small Q-ensemble that incorporates predictive uncertainty and time/effort cost. The method exposes transparent dials for neighborhood size and uncertainty/cost penalties and preserves an auditable training pipeline. Evaluated on a de-identified operational dataset, TTL+ITD yields stable value estimates and predictable efficiency trade-offs, and supports subgroup auditing.
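The abstract does not give the scoring rule, so the following is only a minimal sketch of how the two components could fit together, under stated assumptions: the calibration term is taken to be the mean residual of the k nearest logged states, and the deliberation score is taken to be the ensemble mean minus a penalty on ensemble standard deviation and a penalty on time cost. The function names, the parameters `k`, `beta`, and `lam`, and all numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

ACTIONS = ["text", "phone", "video", "in_person"]


def local_calibration(state, neighbor_states, neighbor_residuals, k=5):
    """Assumed TTL step: average the per-action residuals of the k nearest logged states.

    neighbor_states:    (n, d) array of logged state features
    neighbor_residuals: (n, n_actions) array of observed-minus-predicted values
    """
    dists = np.linalg.norm(neighbor_states - state, axis=1)
    nearest = np.argsort(dists)[:k]  # k is the "neighborhood size" dial
    return neighbor_residuals[nearest].mean(axis=0)


def deliberate(q_ensemble, time_cost, beta=1.0, lam=0.01, calibration=None):
    """Assumed ITD step: penalize ensemble disagreement and time cost, then pick the best action.

    q_ensemble: (n_members, n_actions) array of Q-value estimates
    time_cost:  (n_actions,) array, e.g. staff minutes per contact
    """
    mean_q = q_ensemble.mean(axis=0)
    std_q = q_ensemble.std(axis=0)  # disagreement as an uncertainty proxy
    scores = mean_q - beta * std_q - lam * time_cost
    if calibration is not None:
        scores = scores + calibration  # local correction from the TTL step
    return int(np.argmax(scores)), scores


# Illustrative numbers only: three ensemble members, four outreach modalities.
q_ensemble = np.array([
    [1.0, 1.2, 1.3, 1.6],
    [1.0, 1.1, 1.4, 1.0],
    [1.0, 1.3, 1.2, 1.9],
])
time_cost = np.array([1.0, 5.0, 20.0, 45.0])

greedy, _ = deliberate(q_ensemble, time_cost, beta=0.0, lam=0.0)
penalized, _ = deliberate(q_ensemble, time_cost, beta=1.0, lam=0.01)
print(ACTIONS[greedy], ACTIONS[penalized])
```

With both penalties set to zero the sketch falls back to the plain ensemble mean, which makes the effect of each dial easy to audit: raising `beta` steers away from actions the ensemble disagrees on, and raising `lam` steers away from time-intensive modalities.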