Optimal strategies in Markov decision processes with finitely additive evaluations

📅 2026-03-04
🤖 AI Summary
This study investigates whether optimal strategies exist in infinite-horizon Markov decision processes (MDPs) with finite state and action spaces when the stream of expected stage rewards is aggregated by a diffuse charge, that is, a diffuse finitely additive probability measure on the set of stages. The results of Neyman [2023] imply that a pure optimal strategy exists whenever the charge satisfies the time value of money principle. Using a delicately constructed aggregation charge, the authors show that without such an assumption an optimal strategy may fail to exist, whether pure or randomized. The counterexample answers in the negative the question of whether optimal strategies exist for arbitrary aggregation charges, and shows that existence depends on the aggregation scheme.

📝 Abstract
We study infinite-horizon Markov decision processes (MDPs) where the decision maker evaluates each of her strategies by aggregating the infinite stream of expected stage-rewards. The crucial feature of our approach is that the aggregation is performed by means of a given diffuse charge (a diffuse finitely additive probability measure) on the set of stages. The results of Neyman [2023] imply that in this setting, in every MDP with finite state and action spaces, the decision maker has a pure optimal strategy as long as the diffuse charge satisfies the time value of money principle. His result raises the question of existence of an optimal strategy without additional assumptions on the aggregation charge. We answer this question in the negative with a counterexample. With a delicately constructed aggregation charge, the MDP has no optimal strategy at all, neither pure nor randomized.
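The evaluation described in the abstract can be written out as a short sketch. The notation here is assumed for illustration and is not taken from the paper: stages are indexed by t in the natural numbers, σ denotes a strategy, r_t the stage-t reward, and μ the aggregation charge.

```latex
% Assumed notation (not from the paper): \sigma a strategy, r_t the stage-t reward,
% \mu a charge defined on all subsets of the set of stages \mathbb{N}.

% A diffuse charge is a finitely additive probability that gives every single stage weight zero:
\mu(\mathbb{N}) = 1, \qquad
\mu(A \cup B) = \mu(A) + \mu(B) \ \text{ for disjoint } A, B \subseteq \mathbb{N}, \qquad
\mu(\{t\}) = 0 \ \text{ for every } t \in \mathbb{N}.

% The decision maker evaluates \sigma by integrating the expected stage rewards against \mu:
v_\mu(\sigma) \;=\; \int_{\mathbb{N}} \mathbb{E}_\sigma[r_t] \, d\mu(t).

% Because \mu assigns zero weight to every finite set of stages, the evaluation
% depends only on the tail of the reward stream:
\liminf_{t \to \infty} \mathbb{E}_\sigma[r_t]
\;\le\; v_\mu(\sigma) \;\le\;
\limsup_{t \to \infty} \mathbb{E}_\sigma[r_t].

% A strategy \sigma^* is optimal if it attains the supremum:
v_\mu(\sigma^*) \;=\; \sup_{\sigma} v_\mu(\sigma).
```

In this notation, the counterexample exhibits a finite MDP and a diffuse charge μ for which the supremum is not attained by any pure or randomized strategy.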
Problem

Research questions and friction points this paper is trying to address.

Markov decision processes
finitely additive evaluations
diffuse charge
optimal strategy
infinite-horizon
Innovation

Methods, ideas, or system contributions that make the work stand out.

Markov decision processes
finitely additive measures
diffuse charge
optimal strategy
infinite-horizon
János Flesch
Maastricht University
Game theory
Arkadi Predtetchinski
Department of Microeconomics and Public Economics, Maastricht University, P.O. Box 616, 6200 MD, The Netherlands
William D Sudderth
School of Statistics, University of Minnesota, Minneapolis, MN 55455, United States
Xavier Venel
LUISS Guido Carli
Game theory