🤖 AI Summary
This paper investigates the mechanisms by which independent Q-learning agents spontaneously collude to sustain supra-competitive prices in infinitely repeated pricing games. Addressing the limited explanatory power of "black-box collusion" accounts, we identify two key human-driven factors: experience replay policies (particularly recency bias) and relative performance concerns. We design three experience replay variants and conduct robustness experiments within a heterogeneous multi-agent framework. Results show that randomized experience replay effectively suppresses collusion, driving outcomes toward the static Bertrand equilibrium; conversely, recency-biased replay or the incorporation of relative performance feedback markedly enhances both the stability and persistence of high-price collusion. This work provides the first systematic causal evidence of how experience replay and relative performance incentives shape the formation of algorithmic collusion. It thereby establishes an actionable theoretical basis for regulatory interventions, such as auditing replay mechanisms, to mitigate anti-competitive behavior in autonomous agent systems.
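To make the replay variants concrete, the sketch below implements a tabular Q-learning pricing agent whose buffer can be sampled either uniformly at random (randomized replay) or with a bias toward recent transitions (recency-biased replay). This is a minimal sketch, not the paper's implementation: the class name `ReplayQLearner`, the `recency_bias` weighting scheme, and all parameter values are illustrative assumptions.

```python
# Sketch of two experience-replay variants for a tabular Q-learning
# pricing agent. Assumption: states and actions are price indices on a
# discrete grid, so Q is an n_prices x n_prices table.
import random
from collections import deque

class ReplayQLearner:
    def __init__(self, n_prices, alpha=0.1, gamma=0.95,
                 buffer_size=1000, recency_bias=0.0):
        self.q = [[0.0] * n_prices for _ in range(n_prices)]  # Q[state][action]
        self.alpha, self.gamma = alpha, gamma
        self.buffer = deque(maxlen=buffer_size)
        self.recency_bias = recency_bias  # 0.0 = uniform replay; > 0 favors recent

    def act(self, state, epsilon=0.1):
        # Epsilon-greedy action selection over the price grid.
        if random.random() < epsilon:
            return random.randrange(len(self.q[state]))
        return max(range(len(self.q[state])), key=lambda a: self.q[state][a])

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self):
        n = len(self.buffer)
        if self.recency_bias == 0.0:
            return self.buffer[random.randrange(n)]  # randomized replay
        # Recency-biased replay: transition i gets weight (i + 1)^bias,
        # so the newest experience dominates as the bias grows.
        weights = [(i + 1) ** self.recency_bias for i in range(n)]
        return random.choices(list(self.buffer), weights=weights, k=1)[0]

    def replay_update(self, batch=32):
        # Standard one-step Q-learning update on replayed transitions.
        for _ in range(min(batch, len(self.buffer))):
            s, a, r, s2 = self.sample()
            target = r + self.gamma * max(self.q[s2])
            self.q[s][a] += self.alpha * (target - self.q[s][a])
```

Under the paper's findings, setting `recency_bias = 0.0` (uniform sampling) corresponds to the variant that pushes prices toward the Bertrand equilibrium, while a positive bias corresponds to the variant that restores collusive pricing; the specific power-law weighting here is only one way to operationalize recency and is an assumption of this sketch.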
📝 Abstract
In an infinitely repeated pricing game, pricing algorithms based on artificial intelligence (Q-learning) may consistently learn to charge supra-competitive prices, even without communication. Although concerns about algorithmic collusion have arisen, little is known about its underlying factors. In this work, we experimentally analyze the dynamics of algorithms with three variants of experience replay. Algorithmic collusion nevertheless has roots in human preferences: randomizing experience yields prices close to the static Bertrand equilibrium, while higher prices are easily restored by favoring the latest experience. Moreover, relative performance concerns also stabilize collusion. Finally, we investigate scenarios with heterogeneous agents and test the robustness of our results to various factors.
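One way to read "relative performance concerns" is as reward shaping: the agent's learning signal mixes its own profit with the gap to its rival's profit. The sketch below illustrates this on top of a logit demand system, which is common in this literature; the functional form, the weight `lambda_rel`, and the demand parameters `a`, `mu`, and `cost` are illustrative assumptions, not the paper's calibration.

```python
# Hedged illustration of a relative-performance reward in a duopoly
# pricing game with logit demand and an outside option.
import math

def logit_demand(p_own, p_rival, a=2.0, mu=0.25):
    """Demand share for the firm charging p_own (assumed logit form)."""
    num = math.exp((a - p_own) / mu)
    return num / (num + math.exp((a - p_rival) / mu) + 1.0)  # +1: outside option

def relative_reward(p_own, p_rival, cost=1.0, lambda_rel=0.5):
    """Own profit minus a penalty for falling behind the rival's profit."""
    pi_own = (p_own - cost) * logit_demand(p_own, p_rival)
    pi_rival = (p_rival - cost) * logit_demand(p_rival, p_own)
    return pi_own - lambda_rel * (pi_rival - pi_own)
```

With `lambda_rel = 0` the agent simply maximizes its own profit; positive values penalize being undercut by the rival, which, per the abstract's finding, is the kind of relative performance feedback that stabilizes high-price collusion. The exact form of the penalty term here is an assumption of this sketch.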