Feasibility-Guided Fair Adaptive Offline Reinforcement Learning for Medicaid Care Management

📅 2025-09-11
📈 Citations: 0 (influential: 0)
🤖 AI Summary
This study addresses safety and fairness challenges in offline reinforcement learning for Medicaid care management. We propose a grouped safety-threshold adaptation method that jointly performs group-aware safety calibration and fairness optimization, targeting either coverage parity or harm parity across protected subgroups, while preserving policy value via a feasibility-guided mechanism. The method operates on de-identified longitudinal healthcare trajectories and integrates behavior cloning, a HACO baseline comparison, bootstrapped 95% confidence intervals, and significance tests of subgroup differences. Compared to a globally safety-constrained baseline, the approach maintains comparable policy value while significantly improving fairness metrics (p < 0.01), demonstrating that safety and subgroup fairness can be ensured jointly in real-world Medicaid programs. The core contribution is to decouple a rigid global safety constraint into subgroup-sensitive, dynamically adjusted thresholds, with fairness optimization grounded in feasibility guarantees.
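The summary describes per-group safety calibration combined with a feasibility guard. A minimal sketch of what that could look like, assuming a split-conformal quantile rule over predicted harm scores; the function name, the quantile rule, and the `min_feasible_frac` guard are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def calibrate_group_thresholds(risk_scores, groups, alpha=0.1,
                               min_feasible_frac=0.2):
    """Per-group conformal-style safety thresholds with a feasibility guard.

    risk_scores: predicted harm scores on held-out calibration data.
    groups: parallel array of protected-subgroup labels.
    Returns {group: tau}; an action is admissible when its risk <= tau.
    """
    risk_scores = np.asarray(risk_scores, dtype=float)
    groups = np.asarray(groups)
    thresholds = {}
    for g in np.unique(groups):
        scores = np.sort(risk_scores[groups == g])
        n = len(scores)
        # split-conformal quantile: the ceil((n+1)(1-alpha))-th order statistic
        k = min(n - 1, int(np.ceil((n + 1) * (1 - alpha))) - 1)
        tau = scores[k]
        # feasibility guard: never let the admissible set shrink below a
        # minimum fraction of candidate actions; relax tau upward if needed
        k_min = max(0, int(np.ceil(min_feasible_frac * n)) - 1)
        thresholds[g] = max(tau, scores[k_min])
    return thresholds
```

The guard is what keeps a strict per-group threshold from leaving a subgroup with no admissible actions, which is the feasibility concern the summary raises.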

📝 Abstract
We introduce Feasibility-Guided Fair Adaptive Reinforcement Learning (FG-FARL), an offline RL procedure that calibrates per-group safety thresholds to reduce harm while equalizing a chosen fairness target (coverage or harm) across protected subgroups. Using de-identified longitudinal trajectories from a Medicaid population health management program, we evaluate FG-FARL against behavior cloning (BC) and HACO (Hybrid Adaptive Conformal Offline RL; a global conformal safety baseline). We report off-policy value estimates with bootstrap 95% confidence intervals and subgroup disparity analyses with p-values. FG-FARL achieves comparable value to baselines while improving fairness metrics, demonstrating a practical path to safer and more equitable decision support.
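The abstract reports off-policy value estimates with bootstrap 95% confidence intervals. A percentile-bootstrap sketch over per-trajectory returns; resampling whole trajectories is an assumption here, and the paper's estimator may differ:

```python
import numpy as np

def bootstrap_value_ci(returns, n_boot=2000, seed=0):
    """Percentile-bootstrap 95% CI for the mean per-trajectory return.

    returns: one scalar return per evaluation trajectory.
    """
    rng = np.random.default_rng(seed)
    returns = np.asarray(returns, dtype=float)
    # resample trajectories with replacement and recompute the mean
    means = np.array([
        rng.choice(returns, size=len(returns), replace=True).mean()
        for _ in range(n_boot)
    ])
    lo, hi = np.percentile(means, [2.5, 97.5])
    return returns.mean(), (lo, hi)
```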
Problem

Research questions and friction points this paper is trying to address.

Calibrates safety thresholds per subgroup to reduce harm
Equalizes fairness targets across protected Medicaid subgroups
Improves safety and equity in offline reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Feasibility-guided adaptive offline reinforcement learning
Calibrates per-group safety thresholds
Improves fairness metrics while maintaining value
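For the subgroup disparity analyses with p-values mentioned in the abstract, a generic two-group permutation test is one plausible shape; the test statistic and permutation scheme below are assumptions, not the paper's specified procedure:

```python
import numpy as np

def disparity_pvalue(outcomes, groups, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the gap in mean outcome
    (e.g. harm or coverage indicator) between two protected subgroups."""
    rng = np.random.default_rng(seed)
    outcomes = np.asarray(outcomes, dtype=float)
    groups = np.asarray(groups)
    a, b = np.unique(groups)[:2]
    observed = abs(outcomes[groups == a].mean() - outcomes[groups == b].mean())
    count = 0
    for _ in range(n_perm):
        # shuffle group labels to simulate the no-disparity null
        perm = rng.permutation(groups)
        gap = abs(outcomes[perm == a].mean() - outcomes[perm == b].mean())
        count += gap >= observed
    # add-one correction keeps the p-value strictly positive
    return (count + 1) / (n_perm + 1)
```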
Sanjay Basu
Waymark, San Francisco, CA, USA; San Francisco General Hospital, University of California San Francisco, San Francisco, CA, USA
Sadiq Y. Patel
Waymark, San Francisco, CA, USA; University of Pennsylvania, Philadelphia, PA, USA
Parth Sheth
University of Pennsylvania
Bhairavi Muralidharan
Waymark, San Francisco, CA, USA
Namrata Elamaran
Waymark, San Francisco, CA, USA
Aakriti Kinra
Waymark, San Francisco, CA, USA
Rajaie Batniji
Waymark, San Francisco, CA, USA