Test-Time Learning and Inference-Time Deliberation for Efficiency-First Offline Reinforcement Learning in Care Coordination and Population Health Management

📅 2025-09-19

🤖 AI Summary

This study addresses scalable care coordination and population health management for Medicaid and safety-net populations, where outreach modalities (SMS, phone calls, video visits, in-person encounters) carry low clinical risk but differ substantially in time and opportunity cost. It proposes TTL+ITD, a lightweight offline reinforcement learning framework that augments a trained policy with test-time learning (TTL) via local neighborhood calibration and inference-time deliberation (ITD) via a small Q-ensemble that accounts for both predictive uncertainty and time/effort cost. Tunable dials for neighborhood size and for the uncertainty and cost penalties make the efficiency trade-offs explicit, and the training pipeline remains auditable. Evaluated on a de-identified operational dataset, TTL+ITD yields stable value estimates, predictable efficiency trade-offs, and subgroup-level auditing.

📝 Abstract

Care coordination and population health management programs serve large Medicaid and safety-net populations and must be auditable, efficient, and adaptable. While clinical risk for outreach modalities is typically low, time and opportunity costs differ substantially across text, phone, video, and in-person visits. We propose a lightweight offline reinforcement learning (RL) approach that augments trained policies with (i) test-time learning (TTL) via local neighborhood calibration, and (ii) inference-time deliberation (ITD) via a small Q-ensemble that incorporates predictive uncertainty and time/effort cost. The method exposes transparent dials for neighborhood size and uncertainty/cost penalties and preserves an auditable training pipeline. Evaluated on a de-identified operational dataset, TTL+ITD achieves stable value estimates with predictable efficiency trade-offs and supports subgroup auditing.
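The inference-time deliberation step can be illustrated with a minimal sketch. The abstract does not give the scoring rule, so the form below (ensemble mean minus an uncertainty penalty minus a cost penalty) is an assumption, as are the names `deliberate`, `beta`, and `lam`, the modality list, and every number.

```python
from statistics import fmean, pstdev

# Hypothetical outreach modalities and per-contact time costs in minutes.
# These values are illustrative assumptions, not figures from the paper.
MODALITIES = ["text", "phone", "video", "in_person"]
TIME_COST_MIN = [1.0, 10.0, 20.0, 45.0]


def deliberate(q_ensemble, costs, beta=1.0, lam=0.01):
    """Score each action as mean(Q) - beta * std(Q) - lam * cost.

    q_ensemble: list of per-member lists, one Q-value per action.
    costs:      time/effort cost per action.
    beta, lam:  assumed names for the uncertainty and cost penalty dials.
    Returns (index of the best-scoring action, list of scores).
    """
    n_actions = len(costs)
    scores = []
    for a in range(n_actions):
        member_qs = [member[a] for member in q_ensemble]
        mean_q = fmean(member_qs)
        spread = pstdev(member_qs)  # ensemble disagreement as an uncertainty proxy
        scores.append(mean_q - beta * spread - lam * costs[a])
    best = max(range(n_actions), key=scores.__getitem__)
    return best, scores


# Three hypothetical ensemble members scoring the four modalities for one state.
q_values = [
    [0.50, 0.60, 0.70, 0.90],
    [0.52, 0.58, 0.60, 0.50],
    [0.48, 0.62, 0.80, 1.00],
]

greedy, _ = deliberate(q_values, TIME_COST_MIN, beta=0.0, lam=0.0)
cautious, _ = deliberate(q_values, TIME_COST_MIN, beta=1.0, lam=0.0)
frugal, _ = deliberate(q_values, TIME_COST_MIN, beta=1.0, lam=0.01)
print(MODALITIES[greedy], MODALITIES[cautious], MODALITIES[frugal])
```

With both penalties at zero the sketch picks the action with the highest mean Q-value. Raising `beta` steers it away from actions on which the ensemble members disagree, and raising `lam` steers it toward cheaper modalities, which is one way the efficiency trade-off could be made explicit.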

Problem

Research questions and friction points this paper is trying to address.

Optimizing care coordination efficiency across different outreach modalities

Developing auditable reinforcement learning for population health management

Balancing clinical risk with time and opportunity costs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Test-time learning via local neighborhood calibration

Inference-time deliberation using a small Q-ensemble

Explicit penalties for predictive uncertainty and time/effort cost
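One possible reading of local neighborhood calibration is sketched below. This summary does not specify the procedure, so the approach shown (shifting a Q-value prediction by the mean residual of the k nearest logged states that received the same action) is an assumption, and all names (`knn_indices`, `calibrate_q`, `k`) and data are hypothetical.

```python
import math


def knn_indices(query, states, k):
    """Return indices of the k logged states closest to `query` (Euclidean)."""
    dists = sorted((math.dist(query, s), i) for i, s in enumerate(states))
    return [i for _, i in dists[:k]]


def calibrate_q(query, action, q_pred, states, actions, targets, preds, k=3):
    """Shift q_pred by the mean local residual (target - prediction).

    Only neighbors that were logged with the same `action` contribute.
    If none did, the prediction is returned unchanged.
    """
    neighbors = knn_indices(query, states, k)
    residuals = [targets[i] - preds[i] for i in neighbors if actions[i] == action]
    if not residuals:
        return q_pred  # no local evidence for this action
    return q_pred + sum(residuals) / len(residuals)


# Hypothetical logged data: 2-D state features, action taken,
# observed return target, and the trained model's prediction.
states = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
actions = ["phone", "phone", "text", "phone"]
targets = [1.0, 1.2, 0.5, 3.0]
preds = [0.8, 0.9, 0.5, 1.0]

query = (0.05, 0.05)
calibrated = calibrate_q(query, "phone", 1.0, states, actions, targets, preds, k=3)
unchanged = calibrate_q(query, "video", 0.7, states, actions, targets, preds, k=3)
print(calibrated, unchanged)
```

Here `k` plays the role of the neighborhood-size dial: a small `k` adapts to very local evidence, while a large `k` approaches a global correction. Because the adjustment only reads logged data at query time, the trained model is left untouched in this sketch.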
