Nash Equilibrium Constrained Auto-bidding With Bi-level Reinforcement Learning

📅 2025-03-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing auto-bidding services struggle to maximize platform social welfare and advertisers' individual utilities at the same time, and they neglect Nash equilibrium constraints. This paper formulates the Nash-Equilibrium Constrained Bidding (NCB) problem, which adopts the ε-Nash equilibrium (ε-NE) as the solution concept so that social welfare can be improved without compromising advertisers' strategic stability. The authors present NCB as a new auto-bidding formulation, taken from the platform's perspective, that explicitly incorporates Nash equilibrium constraints. To solve NCB, they design a Bi-level Policy Gradient (BPG) framework whose computational complexity is independent of the number of advertisers, and they establish theoretical guarantees for it. Experiments on both synthetic and real-world advertising data are reported to improve social welfare and individual return on investment (ROI), with stable convergence and effective scaling as the number of advertisers grows.

📝 Abstract

Many online advertising platforms provide advertisers with auto-bidding services to enhance their advertising performance. However, most existing auto-bidding algorithms fail to accurately capture the auto-bidding problem formulation that the platform truly faces, let alone solve it. We argue that the platform should try to help optimize each advertiser's performance to the greatest extent -- which makes $\epsilon$-Nash Equilibrium ($\epsilon$-NE) a necessary solution concept -- while maximizing the social welfare of all the advertisers for the platform's long-term value. Based on this, we introduce Nash-Equilibrium Constrained Bidding (NCB), a new formulation of the auto-bidding problem from the platform's perspective. Specifically, it aims to maximize the social welfare of all advertisers under the $\epsilon$-NE constraint. However, the NCB problem presents significant challenges due to its constrained bi-level structure and the typically large number of advertisers involved. To address these challenges, we propose a Bi-level Policy Gradient (BPG) framework with theoretical guarantees. Notably, its computational complexity is independent of the number of advertisers, and the associated gradients are straightforward to compute. Extensive simulated and real-world experiments validate the effectiveness of the BPG framework.
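
The objective described in the abstract can be sketched as a constrained optimization problem. The notation below is assumed for illustration and is not taken from the paper: $\pi = (\pi_1, \dots, \pi_n)$ is the profile of bidding policies for $n$ advertisers, $\pi_{-i}$ is the profile of everyone except advertiser $i$, $u_i$ is advertiser $i$'s utility, and $W$ is the social welfare. The constraint is the standard definition of an $\epsilon$-NE: no advertiser can gain more than $\epsilon$ by deviating alone.

```latex
\max_{\pi} \; W(\pi)
\quad \text{s.t.} \quad
\max_{\pi_i'} \, u_i(\pi_i', \pi_{-i}) \;-\; u_i(\pi_i, \pi_{-i}) \;\le\; \epsilon
\qquad \forall\, i \in \{1, \dots, n\}.
```

The inner maximization over $\pi_i'$ is each advertiser's best-response problem, nested inside the outer welfare maximization. This nesting is one way to read the "constrained bi-level structure" the abstract mentions.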

Problem

Research questions and friction points this paper is trying to address.

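Optimize each advertiser's performance, which makes ε-Nash Equilibrium (ε-NE) the necessary solution concept.

Maximize the social welfare of all advertisers under the ε-NE constraint.

Solve the resulting constrained bi-level problem at scale, given the typically large number of advertisers.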
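
To make the ε-NE constraint concrete, here is a minimal sketch on a toy two-advertiser game with discrete bid levels. It is a brute-force illustration of "maximize welfare subject to ε-NE", not the BPG method. The payoff table, the action names, the helper functions, and the choice of welfare as the sum of utilities are all assumptions made for this example.

```python
from itertools import product

# Hypothetical toy game: two advertisers, each picks a "low" or "high" bid.
# PAYOFF maps an action profile to (utility of advertiser 0, utility of advertiser 1).
ACTIONS = [["low", "high"], ["low", "high"]]
PAYOFF = {
    ("low", "low"): (3, 3),
    ("low", "high"): (0, 4),
    ("high", "low"): (4, 0),
    ("high", "high"): (1, 1),
}


def max_unilateral_gain(actions, payoff, profile):
    """Largest utility gain any one player can obtain by deviating alone."""
    gain = 0.0
    for i, own_actions in enumerate(actions):
        current = payoff[profile][i]
        for a in own_actions:
            deviated = profile[:i] + (a,) + profile[i + 1:]
            gain = max(gain, payoff[deviated][i] - current)
    return gain


def best_eps_nash_profile(actions, payoff, eps):
    """Among all eps-NE profiles, return the one with the highest welfare.

    Welfare is assumed here to be the sum of utilities.
    """
    best, best_welfare = None, float("-inf")
    for profile in product(*actions):
        # eps-NE check: nobody gains more than eps by deviating unilaterally.
        if max_unilateral_gain(actions, payoff, profile) <= eps:
            welfare = sum(payoff[profile])
            if welfare > best_welfare:
                best, best_welfare = profile, welfare
    return best, best_welfare


print(best_eps_nash_profile(ACTIONS, PAYOFF, eps=0))  # (('high', 'high'), 2)
print(best_eps_nash_profile(ACTIONS, PAYOFF, eps=1))  # (('low', 'low'), 6)
```

In this toy game, the only exact equilibrium has welfare 2, while relaxing to ε = 1 admits a profile with welfare 6. Enumeration like this grows exponentially with the number of advertisers, which is the scalability problem the bullet above refers to.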

Innovation

Methods, ideas, or system contributions that make the work stand out.

Nash-Equilibrium Constrained Bidding (NCB) formulation

Bi-level Policy Gradient (BPG) framework

Computational complexity independent of the number of advertisers
