🤖 AI Summary
Evaluating joint-strategy stability in dynamic multi-agent games remains challenging due to non-stationary environments and complex inter-agent dependencies.
Method: This paper proposes an analytical framework integrating empirical game modeling with $\alpha$-Rank evolutionary ranking. It applies $\alpha$-Rank to empirical games derived from dynamic interactions, leveraging long-term evolutionary dynamics to jointly assess the robustness and payoffs of joint strategies. Policies realizing the strategies are trained with DQN in a stochastic graph coloring environment, a collaborative and interpretable simulation setting, enabling transparent quantification of joint-strategy stability.
Contribution/Results: The framework achieves principled, scalable joint-strategy evaluation by balancing stability and evolutionary advantage. Experiments demonstrate its effectiveness in identifying joint strategies that exhibit both long-term stability and superior evolutionary fitness in complex, dynamic multi-agent settings, while maintaining computational tractability and interpretability.
📝 Abstract
Game-theoretic solution concepts, such as the Nash equilibrium, have been key to finding stable joint actions in multi-player games. However, it has been shown that the dynamics of agents' interactions, even in simple two-player games with few strategies, can fail to reach Nash equilibria, exhibiting complex and unpredictable behavior. Instead, evolutionary approaches can describe the long-term persistence of strategies and filter out transient ones, accounting for the long-term dynamics of agents' interactions. Our goal is to identify agents' joint strategies in dynamic games that result in stable behavior, resistant to changes, while also accounting for agents' payoffs. Towards this goal, and building on previous results, this paper proposes transforming dynamic games into their empirical forms by considering agents' strategies instead of agents' actions, and applying the evolutionary methodology $\alpha$-Rank to evaluate and rank strategy profiles according to their long-term dynamics. This methodology not only allows us to identify joint strategies that are strong through agents' long-term interactions, but also provides a descriptive, transparent framework explaining the high ranking of these strategies. Experiments report on agents that aim to collaboratively solve a stochastic version of the graph coloring problem. We consider different styles of play as strategies to define the empirical game, and train policies realizing these strategies using the DQN algorithm. We then run simulations to generate the payoff matrix required by $\alpha$-Rank to rank joint strategies.
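The core of the pipeline described above is feeding a simulation-derived payoff matrix into $\alpha$-Rank, which ranks strategies by the stationary distribution of an evolutionary Markov chain over strategy profiles. The following is a minimal, simplified sketch of that final step for a single-population symmetric game; the function names, the choice of selection intensity `alpha`, and the population size `m` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fixation_prob(f_res, f_mut, alpha, m):
    """Probability that one mutant with fitness f_mut takes over a
    resident population of size m (fitness f_res), under exponential
    (Fermi-style) selection with intensity alpha."""
    d = f_mut - f_res
    if abs(d) < 1e-12:
        return 1.0 / m  # neutral drift
    return (1.0 - np.exp(-alpha * d)) / (1.0 - np.exp(-alpha * m * d))

def alpha_rank(M, alpha=10.0, m=50):
    """Simplified single-population alpha-Rank.

    M[i, j] = payoff to a player using strategy i against strategy j
    (e.g. estimated from repeated simulations of trained policies).
    Returns the stationary distribution over pure strategies; higher
    mass = higher evolutionary ranking."""
    n = M.shape[0]
    C = np.zeros((n, n))
    for r in range(n):
        for s in range(n):
            if r == s:
                continue
            # Monomorphic resident population of r invaded by mutant s:
            # resident fitness ~ M[r, r], mutant fitness ~ M[s, r].
            C[r, s] = fixation_prob(M[r, r], M[s, r], alpha, m) / (n - 1)
        C[r, r] = 1.0 - C[r].sum()  # stay in the resident state
    # Stationary distribution: left eigenvector of C for eigenvalue 1.
    vals, vecs = np.linalg.eig(C.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    return pi / pi.sum()
```

For instance, on a toy matrix where one strategy strictly dominates, the stationary mass concentrates on the dominant strategy, matching the intuition that $\alpha$-Rank filters out transient strategies and rewards those that persist under long-term dynamics.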