Randomization for Faster Exact Optimization of Discounted Markov Decision Processes

📅 2026-06-03
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the problem of exactly and efficiently computing optimal values and policies in discounted Markov decision processes (DMDPs). It gives a reduction from exact solution to two easier subproblems: policy evaluation and computation of approximately optimal values. The paper provides a straightforward deterministic reduction and a more efficient randomized variant. Combined with recent advances in approximately solving DMDPs, these reductions yield faster deterministic and randomized algorithms for exactly solving DMDPs.
📝 Abstract
We provide faster deterministic and randomized algorithms for exactly solving discounted Markov Decision Processes (DMDPs). We obtain our results by efficiently reducing computing optimal values and policies in DMDPs to the easier tasks of policy evaluation and computing approximately optimal values in DMDPs. We provide both a straightforward deterministic reduction and a more efficient randomized variant that, together with advances in approximately solving DMDPs, yield our results.
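To make the two subtasks named in the abstract concrete, the following is a minimal sketch on a toy two-state DMDP. It is not the paper's reduction: it naively chains approximate value iteration (floating point), greedy policy extraction, and exact policy evaluation (rational arithmetic), then checks the Bellman optimality condition exactly. The toy instance, the function names, and the iteration count are all assumptions made for illustration.

```python
from fractions import Fraction

def approx_values(P, r, gamma, iters):
    """Approximately optimal values via plain value iteration in floating point."""
    n = len(P)
    v = [0.0] * n
    for _ in range(iters):
        v = [max(float(r[s][a]) + float(gamma) *
                 sum(float(P[s][a][t]) * v[t] for t in range(n))
                 for a in range(len(P[s])))
             for s in range(n)]
    return v

def greedy_policy(P, r, gamma, v):
    """Pick, in each state, an action maximizing the one-step lookahead value."""
    n = len(P)
    def q(s, a):
        return float(r[s][a]) + float(gamma) * sum(float(P[s][a][t]) * v[t] for t in range(n))
    return [max(range(len(P[s])), key=lambda a: q(s, a)) for s in range(n)]

def policy_evaluation(P, r, policy, gamma):
    """Solve (I - gamma * P_pi) v = r_pi exactly by Gauss-Jordan elimination."""
    n = len(policy)
    # Augmented matrix [I - gamma * P_pi | r_pi] over the rationals.
    A = []
    for s in range(n):
        a = policy[s]
        row = [Fraction(int(s == t)) - gamma * P[s][a][t] for t in range(n)]
        row.append(Fraction(r[s][a]))
        A.append(row)
    for col in range(n):
        # The matrix is nonsingular for gamma < 1, so a nonzero pivot exists.
        pivot = next(i for i in range(col, n) if A[i][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        inv = 1 / A[col][col]
        A[col] = [x * inv for x in A[col]]
        for i in range(n):
            if i != col and A[i][col] != 0:
                f = A[i][col]
                A[i] = [x - f * y for x, y in zip(A[i], A[col])]
    return [A[s][n] for s in range(n)]

def is_optimal(P, r, gamma, v):
    """Exact Bellman check: v[s] equals the best one-step lookahead in every state."""
    n = len(P)
    return all(
        v[s] == max(r[s][a] + gamma * sum(P[s][a][t] * v[t] for t in range(n))
                    for a in range(len(P[s])))
        for s in range(n))

# Hypothetical toy DMDP: P[s][a][t] is a transition probability, r[s][a] a reward.
F = Fraction
gamma = F(1, 2)
P = [[[F(1), F(0)], [F(0), F(1)]],   # state 0: action 0 stays, action 1 moves to state 1
     [[F(0), F(1)], [F(1), F(0)]]]   # state 1: action 0 stays, action 1 moves to state 0
r = [[F(1), F(0)],
     [F(3), F(0)]]

v_approx = approx_values(P, r, gamma, iters=50)
pi = greedy_policy(P, r, gamma, v_approx)
v_exact = policy_evaluation(P, r, pi, gamma)
certified = is_optimal(P, r, gamma, v_exact)
```

In this sketch the greedy policy is only guaranteed to be optimal when the approximate values are accurate enough to separate the best action from the rest, which is why the final exact Bellman check is included rather than assumed.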
Problem

Research questions and friction points this paper is trying to address.

Markov Decision Processes
discounted MDPs
exact optimization
randomization
policy evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Markov Decision Processes
randomization
exact optimization
policy evaluation
approximate value computation