🤖 AI Summary
This work investigates whether equilibria can be attained in N-player linear-quadratic (LQ) stochastic differential games when each agent performs independent policy-gradient updates based solely on its own state. By uncovering, for the first time, that asymmetric LQ games still admit an α-potential structure, the analysis unifies convergence guarantees for symmetric and asymmetric settings. Combining this structure with projected gradient methods and stochastic differential game theory, the study establishes linear convergence in both regimes: in the symmetric case, the algorithm converges globally to an equilibrium, with complexity linear in the number of agents and logarithmic in the desired accuracy; in the asymmetric case, it converges linearly to an approximate equilibrium whose suboptimality is proportional to the degree of asymmetry in the game.
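For context, the α-potential property invoked here is, in the standard formulation from the α-potential games literature (notation generic, not taken from this paper: $J_i$ is player $i$'s objective, $\Phi$ the potential, $\pi$ a policy profile), the requirement that a single function track every unilateral deviation up to an additive error $\alpha$:

$$
\bigl| J_i(\pi_i', \pi_{-i}) - J_i(\pi_i, \pi_{-i}) - \bigl(\Phi(\pi_i', \pi_{-i}) - \Phi(\pi_i, \pi_{-i})\bigr) \bigr| \;\le\; \alpha
\quad \text{for all } i,\ \pi_i,\ \pi_i',\ \pi_{-i}.
$$

When $\alpha = 0$ this reduces to an exact potential game; in general, (near-)minimizers of $\Phi$ are equilibria up to an error of order $\alpha$, which is why gradient descent on a single function can certify equilibria for all $N$ players at once.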
📝 Abstract
We analyze independent policy-gradient (PG) learning in $N$-player linear-quadratic (LQ) stochastic differential games. Each player employs a distributed policy that depends only on its own state and updates the policy independently using the gradient of its own objective. We establish global linear convergence of these methods to an equilibrium by showing that the LQ game admits an $\alpha$-potential structure, with $\alpha$ determined by the degree of pairwise interaction asymmetry. For pairwise-symmetric interactions, we construct an affine distributed equilibrium by minimizing the potential function and show that independent PG methods converge globally to this equilibrium, with complexity scaling linearly in the population size and logarithmically in the desired accuracy. For asymmetric interactions, we prove that independent projected PG algorithms converge linearly to an approximate equilibrium, with suboptimality proportional to the degree of asymmetry. Numerical experiments confirm the theoretical results across both symmetric and asymmetric interaction networks.
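To make the learning loop concrete, below is a minimal runnable sketch of independent projected policy gradient in a toy discrete-time, scalar-state surrogate of such a game. Everything here (dynamics, costs, gains, step sizes, the finite-difference gradient estimator) is an illustrative assumption, not the paper's continuous-time model or algorithm:

```python
import numpy as np

# Toy surrogate for independent projected PG in an N-player LQ game.
# Assumptions (illustrative only, not the paper's continuous-time model):
# scalar per-player states, discrete-time dynamics x_i <- a*x_i + b*u_i + w,
# distributed linear policies u_i = -k_i * x_i, and quadratic costs with
# pairwise coupling W[i, j] * (x_i - x_j)^2.

rng = np.random.default_rng(0)
N, T = 5, 50                      # players, rollout horizon
a, b, q, r = 0.9, 1.0, 1.0, 0.1   # dynamics and cost weights
W = rng.uniform(0.0, 0.2, size=(N, N))
W = 0.5 * (W + W.T)               # symmetrize -> pairwise-symmetric case
np.fill_diagonal(W, 0.0)

def rollout_costs(k, noise):
    """Simulate all players under gains k; return each player's average cost."""
    x = np.ones(N)
    costs = np.zeros(N)
    for t in range(T):
        u = -k * x                                    # own-state feedback only
        costs += q * x**2 + r * u**2
        costs += np.sum(W * (x[:, None] - x[None, :])**2, axis=1)
        x = a * x + b * u + noise[t]
    return costs / T

def independent_projected_pg(iters=200, eta=0.05, eps=1e-4, box=(0.0, 1.5)):
    k = np.full(N, 0.5)
    for _ in range(iters):
        noise = 0.1 * rng.standard_normal((T, N))     # common random numbers
        grad = np.zeros(N)
        for i in range(N):
            kp, km = k.copy(), k.copy()
            kp[i] += eps
            km[i] -= eps
            # Player i differentiates only its OWN cost w.r.t. its OWN gain.
            grad[i] = (rollout_costs(kp, noise)[i]
                       - rollout_costs(km, noise)[i]) / (2 * eps)
        k = np.clip(k - eta * grad, *box)             # projected gradient step
    return k

print("learned gains:", np.round(independent_projected_pg(), 3))
```

The sketch keeps the structural features the abstract emphasizes: each policy reads only the player's own state, each player updates independently using the gradient of its own objective, and the update is projected onto a compact set. Skipping the symmetrization of W would correspond to the asymmetric regime, where only an approximate equilibrium is guaranteed.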