🤖 AI Summary
Existing automated bidding services struggle to maximize platform social welfare and advertisers' individual utilities at the same time, and they neglect Nash equilibrium constraints. This paper formulates the Nash-Equilibrium Constrained Bidding (NCB) problem, adopting the ε-Nash equilibrium as the solution concept to enhance social welfare without compromising advertisers' strategic stability. We propose the first automated bidding paradigm explicitly incorporating Nash equilibrium constraints. To solve NCB, we design a Bi-level Policy Gradient (BPG) framework whose computational complexity is independent of the number of advertisers, and establish its theoretical convergence guarantee. Extensive experiments on both synthetic and real-world advertising datasets demonstrate that our method significantly improves both social welfare and individual return-on-investment (ROI), exhibits stable convergence, scales effectively with system size, and achieves a strong balance between theoretical rigor and industrial applicability.
📝 Abstract
Many online advertising platforms provide advertisers with auto-bidding services to enhance their advertising performance. However, most existing auto-bidding algorithms fail to accurately capture the auto-bidding problem that the platform truly faces, let alone solve it. We argue that the platform should help optimize each advertiser's performance to the greatest extent possible, which makes $\epsilon$-Nash Equilibrium ($\epsilon$-NE) a necessary solution concept, while maximizing the social welfare of all advertisers for the platform's long-term value. Based on this, we introduce \emph{Nash-Equilibrium Constrained Bidding} (NCB), a new formulation of the auto-bidding problem from the platform's perspective. Specifically, it aims to maximize the social welfare of all advertisers under the $\epsilon$-NE constraint. However, the NCB problem presents significant challenges due to its constrained bi-level structure and the typically large number of advertisers involved. To address these challenges, we propose a \emph{Bi-level Policy Gradient} (BPG) framework with theoretical guarantees. Notably, its computational complexity is independent of the number of advertisers, and the associated gradients are straightforward to compute. Extensive simulated and real-world experiments validate the effectiveness of the BPG framework.
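The NCB objective described in the abstract can be written compactly. The following is only a sketch in assumed notation, not the paper's own: $\pi_i$ denotes advertiser $i$'s bidding policy, $\pi_{-i}$ the policies of all other advertisers, $u_i$ advertiser $i$'s utility, $N$ the number of advertisers, and $\mathrm{SW}$ the social welfare objective. None of these symbols appear in the abstract.

$$
\max_{\pi = (\pi_1, \dots, \pi_N)} \ \mathrm{SW}(\pi)
\quad \text{s.t.} \quad
u_i(\pi_i, \pi_{-i}) \ \ge\ \max_{\pi_i'} \, u_i(\pi_i', \pi_{-i}) - \epsilon
\quad \text{for all } i \in \{1, \dots, N\}.
$$

Under this reading, the constraint is the standard $\epsilon$-NE condition: no advertiser can gain more than $\epsilon$ by unilaterally changing its policy. The inner maximization over $\pi_i'$, nested inside the outer maximization of $\mathrm{SW}$, is what gives the problem its bi-level structure, and there is one such constraint per advertiser, which is consistent with the difficulty the abstract attributes to a large number of advertisers.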