Hierarchical Multi-agent Meta-Reinforcement Learning for Cross-channel Bidding

📅 2024-12-26
🏛️ IEEE Transactions on Knowledge and Data Engineering
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper addresses cross-channel dynamic budget allocation and global performance optimization under a shared budget in multi-channel real-time bidding (RTB). It proposes a hierarchical multi-agent reinforcement learning framework with three components: (1) a novel CPC-constrained diffusion model for fine-grained, constraint-aware top-level budget allocation; (2) a state-action decoupled actor-critic architecture that mitigates extrapolation error in offline policy learning; and (3) a meta-knowledge-based cross-channel contextual representation that enables efficient knowledge transfer. Evaluated on Meituan's large-scale industrial dataset, the framework significantly improves ROI while raising the CPC constraint satisfaction rate. Overall, it achieves state-of-the-art performance on this dataset.
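
The two-level structure summarized above can be illustrated with a toy sketch. This is not the paper's method: a softmax split stands in for the CPC-constrained diffusion allocator, and a clipped bid stands in for the learned actor-critic policy. All names (`allocate_budget`, `channel_bid`, `cpc_satisfied`) and all numbers are hypothetical and chosen only to show how a top-level allocator and per-channel bidders might interact under a shared budget and a CPC cap.

```python
import math


def allocate_budget(total_budget, channel_scores):
    """Top-level stand-in: softmax split of a shared budget across channels.

    The paper uses a CPC-constrained diffusion model here; softmax is only
    a placeholder that keeps the allocations positive and summing to the total.
    """
    peak = max(channel_scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in channel_scores]
    norm = sum(weights)
    return [total_budget * w / norm for w in weights]


def channel_bid(value_per_click, cpc_cap, remaining_budget):
    """Bottom-level stand-in: bid the click value, clipped by CPC cap and budget.

    The paper learns this policy offline; a fixed rule is assumed here.
    """
    if remaining_budget <= 0:
        return 0.0
    return min(value_per_click, cpc_cap, remaining_budget)


def cpc_satisfied(total_cost, total_clicks, cpc_cap):
    """Check the realized cost-per-click against the cap."""
    if total_clicks == 0:
        return total_cost == 0
    return total_cost / total_clicks <= cpc_cap


# Hypothetical scores for three channels sharing one budget.
budgets = allocate_budget(1000.0, [0.2, 0.5, 0.1])
# Each channel bids per click with its own slice of the budget.
bids = [channel_bid(2.0, 1.5, b) for b in budgets]
```

In this sketch the channel with the highest score receives the largest share, and every bid is held at the CPC cap because the assumed click value exceeds it. A learned allocator would instead adapt the split to channel features and interdependencies, as the summary describes.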

📝 Abstract
Real-time bidding (RTB) plays a pivotal role in online advertising ecosystems. Advertisers employ strategic bidding to optimize their advertising impact while adhering to various financial constraints, such as return-on-investment (ROI) and cost-per-click (CPC). Because traditional approaches focus primarily on bidding under fixed budget constraints, they cannot effectively manage the dynamic budget allocation problem, where the goal is to globally optimize bidding performance across multiple channels with a shared budget. In this paper, we propose a hierarchical multi-agent reinforcement learning framework for multi-channel bidding optimization. In this framework, the top-level strategy applies a CPC-constrained diffusion model to dynamically allocate budgets among the channels according to their distinct features and complex interdependencies. The bottom-level strategy combines two components: a state-action decoupled actor-critic method, which addresses extrapolation errors in offline learning caused by out-of-distribution actions, and a context-based meta-channel knowledge learning method, which improves the state representation capability of the policy using knowledge shared among channels. Comprehensive experiments on a large-scale real-world industrial dataset from the Meituan ad bidding platform demonstrate that our method achieves state-of-the-art performance.

Problem

Research questions and friction points this paper is trying to address.

Optimal Budget Allocation
Real-Time Bidding (RTB)
Cost Efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-layer Learning System
Budget Allocation Optimization
Cross-channel Knowledge Transfer

Authors

Shenghong He
School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
Chao Yu
School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
Qian Lin
Research Engineer, ByteDance
Database, Distributed System, Data Streams
Shangqin Mao
Meituan, Beijing 100102, China
Bo Tang
School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei 340101, China
Qianlong Xie
Meituan, Beijing 100102, China
Xingxing Wang
Meituan, Beijing 100102, China