🤖 AI Summary
This paper investigates the global learning dynamics of memory-asymmetric two-player zero-sum games: Player X employs a reactive strategy conditioned on Y’s previous action, while Y is memoryless. To handle the complex dynamics that this asymmetry induces, we introduce two quantities: an extension of the Kullback-Leibler divergence from the Nash equilibrium, and a family of Lyapunov functions of X’s reactive strategy. These quantities characterize the global behavior of the dynamics. If X exploits Y, the strategies converge to the Nash equilibrium; if Y’s strategy is out of equilibrium, X becomes more exploitative over time. Together with numerical experiments, these results suggest global convergence to the Nash equilibrium. The core contribution is a characterization of global learning behavior under memory asymmetry through these two indicators.
📝 Abstract
This study examines the global behavior of learning dynamics in games between two players, X and Y. We consider the simplest case of memory asymmetry between the players: X memorizes Y's previous action and uses reactive strategies, while Y has no memory. Although this memory complicates the learning dynamics, we characterize their global behavior by discovering and analyzing two novel quantities. One is an extension of the Kullback-Leibler divergence from the Nash equilibrium, which previous studies established as a conserved quantity. The other is a family of Lyapunov functions of X's reactive strategy. One global behavior we capture is that if X exploits Y, their strategies converge to the Nash equilibrium. Another is that if Y's strategy is out of equilibrium, X becomes more exploitative over time. Consequently, we suggest global convergence to the Nash equilibrium on both theoretical and experimental grounds. This study provides a novel characterization of the global behavior of learning in games through these two indicators.
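As a point of reference for the conserved quantity mentioned above, the following is a minimal sketch of the memoryless baseline only, not of the extended divergence or the reactive-strategy dynamics studied in the paper. It assumes continuous-time replicator dynamics (the abstract does not name the learning rule), matching pennies as the game, and a hand-written fourth-order Runge-Kutta integrator; the function names, initial strategies, and step size are all hypothetical choices made for illustration. Under these assumptions, the sum of the Kullback-Leibler divergences from the uniform Nash equilibrium to each player's current strategy should stay approximately constant along a trajectory.

```python
import math


def kl_from_equilibrium(p):
    """KL divergence from the equilibrium (1/2, 1/2) to the strategy (p, 1 - p)."""
    return 0.5 * math.log(0.5 / p) + 0.5 * math.log(0.5 / (1.0 - p))


def velocity(x, y):
    """Replicator dynamics for matching pennies (assumed game and learning rule).

    x and y are the probabilities that X and Y play their first action.
    X receives +1 when the actions match and -1 otherwise; Y receives the negative.
    """
    dx = x * (1.0 - x) * 2.0 * (2.0 * y - 1.0)
    dy = -y * (1.0 - y) * 2.0 * (2.0 * x - 1.0)
    return dx, dy


def rk4_step(x, y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1x, k1y = velocity(x, y)
    k2x, k2y = velocity(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = velocity(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = velocity(x + dt * k3x, y + dt * k3y)
    x_next = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_next = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_next, y_next


def simulate(x0, y0, dt=0.005, steps=4000):
    """Integrate the dynamics and return the divergence sum at start and end."""
    h_start = kl_from_equilibrium(x0) + kl_from_equilibrium(y0)
    x, y = x0, y0
    for _ in range(steps):
        x, y = rk4_step(x, y, dt)
    h_end = kl_from_equilibrium(x) + kl_from_equilibrium(y)
    return h_start, h_end, x, y


if __name__ == "__main__":
    h_start, h_end, x, y = simulate(0.8, 0.6)
    print("divergence sum at start:", h_start)
    print("divergence sum at end:  ", h_end)
    print("final strategies:", x, y)
```

In this baseline the trajectory cycles around the equilibrium rather than approaching it, which is the behavior that the abstract contrasts with the memory-asymmetric case. Any small drift in the printed values comes from the numerical integrator, not from the dynamics.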