Independent Learning in Performative Markov Potential Games

📅 2025-04-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies independent learning in multi-agent reinforcement learning under performative effects, where deploying a policy alters the reward and transition dynamics of the underlying environment, formulated as Markov potential games (MPGs). Since conventional equilibrium concepts break down in this setting, the authors introduce the *performatively stable equilibrium* (PSE) and prove that it exists under a reasonable sensitivity assumption. They analyze two decentralized algorithms, independent policy gradient ascent (IPGA) and independent natural policy gradient (INPG), proving best-iterate convergence to an approximate PSE with an additional term that accounts for the performative effects, and asymptotic last-iterate convergence to an exact PSE for INPG. For a special case of the game, a repeated retraining scheme, in which agents independently optimize a surrogate objective, attains finite-time last-iterate convergence. Extensive experiments validate the theoretical findings.
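In fixed-point terms, a PSE requires each agent's policy to remain optimal in the environment that its own deployment induces. The condition below is a sketch with assumed notation rather than the paper's formal definition: $M(\pi)$ denotes the MPG induced by deploying the joint policy $\pi$, and $V_i$ is agent $i$'s value function.

$$
\pi^{\star}_i \;\in\; \arg\max_{\pi_i} \; V_i\!\left(\pi_i,\, \pi^{\star}_{-i};\; M(\pi^{\star})\right) \qquad \text{for every agent } i.
$$

This mirrors performative stability in single-agent performative prediction, where a stable point is optimal for the very distribution it induces.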

📝 Abstract
Performative Reinforcement Learning (PRL) refers to a scenario in which the deployed policy changes the reward and transition dynamics of the underlying environment. In this work, we study multi-agent PRL by incorporating performative effects into Markov Potential Games (MPGs). We introduce the notion of a performatively stable equilibrium (PSE) and show that it always exists under a reasonable sensitivity assumption. We then provide convergence results for state-of-the-art algorithms used to solve MPGs. Specifically, we show that independent policy gradient ascent (IPGA) and independent natural policy gradient (INPG) converge to an approximate PSE in the best-iterate sense, with an additional term that accounts for the performative effects. Furthermore, we show that INPG asymptotically converges to a PSE in the last-iterate sense. As the performative effects vanish, we recover the convergence rates from prior work. For a special case of our game, we provide finite-time last-iterate convergence results for a repeated retraining approach, in which agents independently optimize a surrogate objective. We conduct extensive experiments to validate our theoretical findings.
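To make the two update rules concrete, here is a minimal, hypothetical sketch in a toy single-state, two-agent common-payoff game whose reward matrix drifts with the deployed joint policy, a stand-in for performative effects. The environment model, the instantaneous-deployment assumption, and all names are illustrative choices, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 2
base_reward = rng.uniform(0.0, 1.0, size=(N_ACTIONS, N_ACTIONS))

def performative_reward(joint_policy, sensitivity=0.3):
    """Hypothetical performative map: reward drifts toward the deployed policy."""
    return base_reward + sensitivity * np.outer(joint_policy[0], joint_policy[1])

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def run(update, lr=0.5, steps=500):
    """Independent learning: each agent updates using only its local gradient."""
    logits = [np.zeros(N_ACTIONS) for _ in range(2)]
    for _ in range(steps):
        policy = [softmax(l) for l in logits]
        R = performative_reward(policy)        # environment reacts to deployment
        q = [R @ policy[1], R.T @ policy[0]]   # per-action values for each agent
        for i in range(2):
            adv = q[i] - policy[i] @ q[i]      # advantage of each action
            logits[i] += lr * update(policy[i], adv)
    return [softmax(l).round(3) for l in logits]

# IPGA: vanilla softmax policy gradient (gradient = pi * advantage).
ipga = run(lambda pi, adv: pi * adv)
# INPG: natural gradient; for tabular softmax policies this reduces to a
# multiplicative-weights step, i.e., the logits move directly along the advantage.
inpg = run(lambda pi, adv: adv)
print("IPGA:", ipga, "INPG:", inpg)
```

Under the abstract's results, one would expect both runs to approach an approximate PSE in the best-iterate sense, with INPG additionally converging in the last iterate.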
Problem

Research questions and friction points this paper is trying to address.

How should equilibria be defined in multi-agent performative RL, where deploying a joint policy shifts the rewards and transitions of the underlying MPG?
Do independent learning algorithms for MPGs, such as IPGA and INPG, still converge when the environment reacts to the deployed policy, and at what rate?
Can the predicted convergence behavior be confirmed empirically?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Performatively stable equilibrium (PSE) for MPGs, shown to exist under a reasonable sensitivity assumption
Best-iterate convergence of IPGA and INPG to an approximate PSE, and asymptotic last-iterate convergence of INPG to a PSE; prior convergence rates are recovered as performative effects vanish
Finite-time last-iterate convergence of a repeated retraining approach for a special case of the game (see the sketch after this list)
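As a hedged illustration of the repeated retraining idea, reusing the toy setting above (all modeling choices are assumptions, not the paper's construction): each round freezes the environment induced by the last deployed joint policy, lets every agent independently optimize against that frozen environment, and redeploys, repeating until a fixed point is reached.

```python
import numpy as np

rng = np.random.default_rng(1)
N_ACTIONS = 2
base_reward = rng.uniform(0.0, 1.0, size=(N_ACTIONS, N_ACTIONS))

def induced_reward(deployed, sensitivity=0.3):
    """Hypothetical environment induced by the previously deployed joint policy."""
    return base_reward + sensitivity * np.outer(deployed[0], deployed[1])

def best_response(R, other, agent):
    """Surrogate step: each agent optimizes against the frozen environment."""
    q = R @ other if agent == 0 else R.T @ other
    p = np.zeros(N_ACTIONS)
    p[np.argmax(q)] = 1.0
    return p

deployed = [np.full(N_ACTIONS, 1.0 / N_ACTIONS) for _ in range(2)]
for round_ in range(50):
    R = induced_reward(deployed)               # freeze performative effects
    retrained = [best_response(R, deployed[1], 0),
                 best_response(R, deployed[0], 1)]
    if all(np.allclose(a, b) for a, b in zip(retrained, deployed)):
        break                                  # fixed point: a stable deployment
    deployed = retrained
print(f"stable after {round_} rounds:", [p.round(3) for p in deployed])
```

With a common-payoff reward and mild sensitivity, this loop typically reaches its fixed point within a few rounds; the paper's finite-time guarantee concerns a special case of the game rather than this toy example.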