🤖 AI Summary
This work addresses the failure of last-iterate convergence of mirror descent in weakly stable games, where rotational or recurrent transients can prevent the iterates from settling. The authors construct a stochastic mirror differential game augmented with an auxiliary memory state. Its stage cost couples two Fenchel terms, a strategic term evaluated at a predicted profile and a corrective term driven by realized feedback, and the resulting equilibrium feedback induces two-channel predictive mirror dynamics in general mirror geometry. Under local mirror regularity, a quantitative local Bregman growth condition, and bounded Brownian diffusion, the paper establishes finite-horizon local terminal-time bounds in expectation and with high probability, together with an estimate of the probability of exiting the localization neighborhood. This gives a unified variational construction of the predictive-memory mirror flow with a local stochastic last-iterate guarantee near stable equilibria, in contrast to predictive variants that are introduced algorithmically and analyzed one at a time.
📝 Abstract
Mirror descent provides a geometric framework for learning in games, but its last-iterate behavior can fail in weakly stable regimes, where the dynamics may exhibit rotational or recurrent transients. Predictive mirror methods mitigate this issue by modifying the feedback entering the mirror update, yet standard predictive variants are typically introduced algorithmically and analyzed one at a time. This letter gives a variational route to predictive feedback by constructing a stochastic mirror differential game with an auxiliary memory state. Its stage cost couples two Fenchel terms: a strategic term evaluated at a predicted profile and a corrective term driven by realized feedback. The resulting equilibrium feedback induces two-channel predictive mirror dynamics in general mirror geometry. Under local mirror regularity, a quantitative local Bregman growth condition, and bounded Brownian diffusion, we establish finite-horizon local terminal-time bounds in expectation and with high probability, together with an exit-probability estimate for the localization neighborhood. The result provides a unified variational construction of the induced predictive-memory mirror flow together with a local stochastic certificate for last-iterate performance near stable equilibria.
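The last-iterate failure that motivates predictive feedback can be seen in a toy setting. The sketch below is not the construction in this letter: it assumes the Euclidean mirror map, a deterministic discrete-time update, and the bilinear game f(x, y) = x * y with equilibrium (0, 0), and it uses the standard optimistic (extrapolated-gradient) correction as a stand-in for predictive feedback. The function name `simulate` and all parameter values are illustrative choices.

```python
import math


def simulate(eta: float, steps: int, predictive: bool) -> float:
    """Return the final distance to the equilibrium (0, 0) of f(x, y) = x * y.

    Player x minimizes f and player y maximizes f. This is a toy illustration
    (Euclidean geometry, no noise), not the dynamics derived in the letter.
    """
    x, y = 1.0, 1.0
    # Feedback from the previous round; initialized to the current feedback,
    # so the first predictive step coincides with a plain step.
    gx_prev, gy_prev = y, x
    for _ in range(steps):
        gx, gy = y, x  # df/dx = y, df/dy = x
        if predictive:
            # Optimistic correction: extrapolate the feedback by its last change.
            dx, dy = 2.0 * gx - gx_prev, 2.0 * gy - gy_prev
        else:
            dx, dy = gx, gy
        x, y = x - eta * dx, y + eta * dy  # descent for x, ascent for y
        gx_prev, gy_prev = gx, gy
    return math.hypot(x, y)


if __name__ == "__main__":
    print("plain:     ", simulate(0.1, 2000, predictive=False))
    print("predictive:", simulate(0.1, 2000, predictive=True))
```

With these settings the plain update rotates around the equilibrium and drifts away from it, while the predictive update contracts toward it. This is the rotational behavior that the abstract attributes to weakly stable regimes, shown in its simplest form.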