🤖 AI Summary
This paper addresses long-horizon multi-agent decision-making under transition-kernel uncertainty, proposing a distributionally robust Markov game (DR-MG) framework under the average-reward criterion to optimize worst-case sustained performance. Methodologically, it combines distributionally robust optimization, robust Bellman equation analysis, and average-reward reinforcement learning to devise a computationally tractable robust Nash iteration algorithm. Theoretically, it establishes, for the first time, foundations for distributionally robust multi-agent games under average reward: it proves the existence of robust Nash equilibria and shows their asymptotic equivalence to equilibria of discounted games as the discount factor approaches one. This work closes a theoretical gap in average-reward multi-agent distributional robustness and introduces a new paradigm for policy design in uncertain dynamic environments, one that delivers both rigorous theoretical guarantees and practical implementability.
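The asymptotic-equivalence claim above has a classical single-policy analogue that is easy to check numerically: for a fixed policy on an ergodic chain, the normalized discounted value (1 − γ)·V_γ converges to the average gain g as γ → 1. The toy chain, rewards, and helper functions below are illustrative assumptions, not constructions from the paper; the paper's result is the game-theoretic, robust analogue of this limit.

```python
import numpy as np

# Toy 3-state chain induced by some fixed policy (illustrative numbers only).
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])   # nominal transition matrix
r = np.array([1.0, 0.2, 0.7])     # per-step rewards

def discounted_value(P, r, gamma):
    """Discounted value V_gamma = (I - gamma * P)^{-1} r."""
    return np.linalg.solve(np.eye(len(r)) - gamma * P, r)

def average_gain(P, r):
    """Average reward g = pi . r, with pi the stationary distribution of P."""
    w, V = np.linalg.eig(P.T)
    pi = np.real(V[:, np.argmin(np.abs(w - 1))])  # left eigenvector for eigenvalue 1
    pi = pi / pi.sum()                            # normalize (also fixes the sign)
    return float(pi @ r)

g = average_gain(P, r)
# The gap max_s |(1 - gamma) * V_gamma(s) - g| shrinks as gamma -> 1.
errs = [np.max(np.abs((1 - gamma) * discounted_value(P, r, gamma) - g))
        for gamma in (0.9, 0.99, 0.999)]
```

The vanishing-discount expansion (1 − γ)V_γ(s) = g + (1 − γ)h(s) + o(1 − γ) explains why the error is roughly proportional to 1 − γ.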
📝 Abstract
This paper introduces the formulation of a distributionally robust Markov game (DR-MG) with average rewards, a crucial framework for multi-agent decision-making under uncertainty over extended horizons. Unlike finite-horizon or discounted models, the average-reward criterion naturally captures long-term performance for systems designed for continuous operation, where sustained reliability is paramount. We account for uncertainty in transition kernels, with players aiming to optimize their worst-case average reward. We first establish a connection between the multi-agent and single-agent settings, and derive the solvability of the robust Bellman equation under the average-reward formulation. We then rigorously prove the existence of a robust Nash equilibrium (NE), offering essential theoretical guarantees for system stability. We further develop and analyze an algorithm named robust Nash iteration to compute robust NEs among all agents, providing practical tools for identifying optimal strategies in complex, uncertain, and long-running multi-player environments. Finally, we demonstrate the connection between average-reward NEs and the well-studied discounted NEs, showing that the former can be approximated by the latter as the discount factor approaches one. Together, these contributions provide a comprehensive theoretical and algorithmic foundation for this setting, and they extend the robust average-reward framework from single-agent problems to the multi-agent case.
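The single-agent building block the abstract invokes, a robust Bellman equation under the average-reward formulation, can be sketched as robust relative value iteration. Everything below is an illustrative assumption rather than the paper's construction: a toy two-state MDP, a total-variation ball around the nominal kernel as the uncertainty set, and a greedy solver for the inner worst-case expectation.

```python
import numpy as np

def worst_case_expectation(p, v, delta):
    """Inner minimization over a total-variation ball:
    min_q q.v  s.t.  q >= 0, sum(q) = 1, ||q - p||_1 <= delta.
    Greedy solution: shift up to delta/2 probability mass from the
    highest-value states onto the lowest-value state."""
    q = p.astype(float).copy()
    budget = delta / 2.0              # total mass the adversary may move
    lo = int(np.argmin(v))            # destination: lowest-value state
    for s in np.argsort(v)[::-1]:     # drain highest-value states first
        if s == lo or budget <= 0:
            continue
        move = min(q[s], budget)
        q[s] -= move
        q[lo] += move
        budget -= move
    return float(q @ v)

def robust_rvi(P, R, delta, iters=2000, ref=0):
    """Robust relative value iteration (single-agent sketch).
    P[s, a] is the nominal next-state distribution, R[s, a] the reward.
    Returns an estimate of the robust gain g and a bias vector h."""
    S, A = R.shape
    h = np.zeros(S)
    for _ in range(iters):
        Th = np.array([
            max(R[s, a] + worst_case_expectation(P[s, a], h, delta)
                for a in range(A))
            for s in range(S)
        ])
        g = Th[ref]       # gain estimate read off the reference state
        h = Th - g        # subtract to keep iterates bounded
    return g, h

# Toy two-state, two-action MDP (illustrative numbers only).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.05, 0.95]]])
R = np.array([[1.0, 0.0],
              [0.5, 0.8]])
g, h = robust_rvi(P, R, delta=0.2)
```

Enlarging the uncertainty set can only lower the worst-case expectation, so the robust gain is monotonically non-increasing in `delta`; the paper's robust Nash iteration couples per-player robust evaluations of this kind, which this single-agent sketch does not attempt.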