🤖 AI Summary
For games beyond normal form, such as Bayesian and extensive-form games, there are several candidate definitions of swap regret, but none is known to be both efficiently minimizable and sufficient to guarantee that a learning algorithm cannot be manipulated by a strategic opponent.
Method: We introduce *profile swap regret*, a new variant of swap regret defined for polytope games, drawing on online convex optimization and approximation algorithm design.
Contribution/Results: We show that sublinear profile swap regret is both necessary and sufficient for a learning algorithm to be non-manipulable by an opponent, resolving an open problem posed by Mansour et al. (2022). Although profile swap regret is NP-hard to compute from a transcript of play, we design efficient learning algorithms that guarantee at most $O(\sqrt{T})$ profile swap regret. Finally, we show that the set of outcomes implementable by low-profile-swap-regret play differs from the set implementable by a third-party mediator, in contrast to the situation in normal-form games.
📝 Abstract
Swap regret is a notion that has proven itself to be central to the study of general-sum normal-form games, with swap-regret minimization leading to convergence to the set of correlated equilibria and guaranteeing non-manipulability against a self-interested opponent. However, the situation for more general classes of games, such as Bayesian games and extensive-form games, is less clear-cut, with multiple candidate definitions for swap regret but no known efficiently minimizable variant of swap regret that implies analogous non-manipulability guarantees. In this paper, we present a new variant of swap regret for polytope games that we call "profile swap regret", with the property that obtaining sublinear profile swap regret is both necessary and sufficient for any learning algorithm to be non-manipulable by an opponent (resolving an open problem of Mansour et al., 2022). Although we show profile swap regret is NP-hard to compute given a transcript of play, we show it is nonetheless possible to design efficient learning algorithms that guarantee at most $O(\sqrt{T})$ profile swap regret. Finally, we explore the correlated equilibrium notion induced by low-profile-swap-regret play, and demonstrate a gap between the set of outcomes that can be implemented by this learning process and the set of outcomes that can be implemented by a third-party mediator (in contrast to the situation in normal-form games).
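For context, the classical swap regret of a normal-form game, the notion the abstract starts from, can be computed directly from a transcript of play. The sketch below is illustrative only: the function name `swap_regret` and the transcript layout (one mixed strategy and one utility vector per round) are assumptions made here, not taken from the paper. It does not compute profile swap regret, which the abstract states is NP-hard to compute from a transcript.

```python
from typing import Sequence


def swap_regret(
    strategies: Sequence[Sequence[float]],
    utilities: Sequence[Sequence[float]],
) -> float:
    """Classical normal-form swap regret of a transcript (hypothetical helper).

    strategies[t][a]: probability the learner placed on action a in round t.
    utilities[t][a]:  payoff action a would have earned in round t.
    """
    n = len(strategies[0])
    total = 0.0
    for a in range(n):
        # Cumulative gain from rerouting all mass played on action a to action b.
        gains = [
            sum(x[a] * (u[b] - u[a]) for x, u in zip(strategies, utilities))
            for b in range(n)
        ]
        # The best swap target is chosen independently for each action a;
        # choosing b == a yields 0, so every term is non-negative.
        total += max(gains)
    return total


# Two rounds, two actions: the learner always plays the action that earns 0.
transcript_strategies = [[1, 0], [0, 1]]
transcript_utilities = [[0, 1], [1, 0]]
print(swap_regret(transcript_strategies, transcript_utilities))  # 2.0
```

In this transcript the best fixed action earns only 1 more than the learner, while swapping each action for the other earns 2 more, which illustrates why swap regret is a stronger benchmark than external regret.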