Hierarchical Multi-agent Meta-Reinforcement Learning for Cross-channel Bidding

📅 2024-12-26

🏛️ IEEE Transactions on Knowledge and Data Engineering

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses dynamic budget allocation across channels in multi-channel real-time bidding (RTB), where all channels draw on a shared budget and the goal is globally optimal bidding performance. It proposes a hierarchical multi-agent reinforcement learning framework with three components: (1) a CPC-constrained diffusion model at the top level that allocates budget among channels while respecting the cost-per-click constraint; (2) a state-action decoupled actor-critic method at the bottom level that mitigates extrapolation errors caused by out-of-distribution actions in offline learning; and (3) a context-based meta-channel knowledge learning method that improves the policy's state representation using knowledge shared among channels. Evaluated on a large-scale industrial dataset from the Meituan ad bidding platform, the framework is reported to improve ROI and CPC constraint satisfaction and to achieve state-of-the-art performance.
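As a rough illustration of the problem described in this summary, the shared-budget objective could be written along the following lines. This is a generic constrained-bidding sketch, not the paper's own formulation: every symbol is an assumption introduced here ($K$ channels, $N_k$ impression opportunities in channel $k$, win indicator $x_{k,i}$, value $v_{k,i}$, cost $c_{k,i}$, click indicator $\mathrm{clk}_{k,i}$, per-channel budget $b_k$, shared budget $B$, CPC cap $C$), and the paper may, for example, impose the CPC constraint per channel instead.

```latex
\begin{aligned}
\max_{\{b_k\},\,\{x_{k,i}\}} \quad
  & \sum_{k=1}^{K} \sum_{i=1}^{N_k} x_{k,i}\, v_{k,i} \\
\text{s.t.} \quad
  & \sum_{i=1}^{N_k} x_{k,i}\, c_{k,i} \le b_k, \qquad k = 1, \dots, K, \\
  & \sum_{k=1}^{K} b_k \le B, \\
  & \frac{\sum_{k=1}^{K} \sum_{i=1}^{N_k} x_{k,i}\, c_{k,i}}
         {\sum_{k=1}^{K} \sum_{i=1}^{N_k} x_{k,i}\, \mathrm{clk}_{k,i}} \le C .
\end{aligned}
```

Read this way, the top level chooses the split $\{b_k\}$ and the bottom level chooses which impressions to win within each $b_k$.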

📝 Abstract

Real-time bidding (RTB) plays a pivotal role in online advertising ecosystems. Advertisers bid strategically to maximize advertising impact while adhering to financial constraints such as return on investment (ROI) and cost per click (CPC). Traditional approaches focus primarily on bidding under a fixed budget constraint and cannot effectively handle dynamic budget allocation, where the goal is to globally optimize bidding performance across multiple channels that share one budget. In this paper, we propose a hierarchical multi-agent reinforcement learning framework for multi-channel bidding optimization. The top-level strategy applies a CPC-constrained diffusion model to dynamically allocate budgets among the channels according to their distinct features and complex interdependencies. The bottom-level strategy combines two methods: a state-action decoupled actor-critic method, which addresses the extrapolation errors that out-of-distribution actions cause in offline learning, and a context-based meta-channel knowledge learning method, which improves the policy's state representation by drawing on knowledge shared among channels. Comprehensive experiments on a large-scale real-world industrial dataset from the Meituan ad bidding platform demonstrate that our method achieves state-of-the-art performance.
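The two-level structure described in the abstract can be pictured with a minimal Python sketch. It is not the paper's method: a softmax over made-up channel scores stands in for the CPC-constrained diffusion model, and a fixed `scale` factor stands in for the learned bottom-level policy. The function names, channel names, and all numbers are hypothetical and chosen only for illustration.

```python
import math


def allocate_budget(total_budget, channel_scores):
    """Top level: split a shared budget across channels.

    A softmax over hypothetical per-channel scores stands in for the
    paper's CPC-constrained diffusion model, which is not reproduced here.
    """
    peak = max(channel_scores.values())  # subtract the max for numerical stability
    weights = {ch: math.exp(s - peak) for ch, s in channel_scores.items()}
    norm = sum(weights.values())
    return {ch: total_budget * w / norm for ch, w in weights.items()}


def channel_bid(pctr, cpc_cap, remaining_budget, scale=1.0):
    """Bottom level: per-impression bid for one channel.

    `scale` stands in for the action of a learned policy. Under the
    simplifying assumption that the bidder pays its bid per impression,
    the expected cost per click is bid / pctr = scale * cpc_cap, so
    scale <= 1 keeps the expected CPC within the cap.
    """
    bid = scale * cpc_cap * pctr
    # Never bid more than the budget this channel has left.
    return max(0.0, min(bid, remaining_budget))


# Hypothetical channels and scores: the two equally scored channels get equal shares.
budgets = allocate_budget(1000.0, {"search": 1.0, "feed": 1.0, "banner": 0.0})
bid = channel_bid(pctr=0.05, cpc_cap=2.0,
                  remaining_budget=budgets["search"], scale=0.8)
```

The point of the split is that the top level only decides how much each channel may spend, while each channel's bidder decides how to spend it; in the paper both levels are learned, whereas here both are fixed rules.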

Problem

Research questions and friction points this paper is trying to address.

Optimal Budget Allocation

Real-Time Bidding (RTB)

Cost Efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-layer Learning System

Budget Allocation Optimization

Cross-channel Knowledge Transfer
