Hidden Representation Clustering with Multi-Task Representation Learning towards Robust Online Budget Allocation

📅 2025-06-01
🤖 AI Summary
This paper addresses the online marketing budget allocation problem in industrial settings, where data exhibit high noise levels, large scale, and uncontrolled quality. Departing from the conventional “prediction-optimization” paradigm, we propose a novel group-level stochastic integer programming framework driven by latent representation clustering. Our key contributions are: (1) learning user latent representations via a multi-task deep neural network and performing group modeling through partition-based clustering in the latent space; and (2) developing a deployable multi-class knowledge distillation model that balances robustness with millisecond-level inference latency. Offline experiments demonstrate significant improvements over six state-of-the-art methods. Online A/B tests on Meituan’s production system show a 0.53% increase in order volume and a 0.65% lift in GMV, validating the effectiveness and practicality of our approach in large-scale real-world systems.

📝 Abstract
Marketing optimization, commonly formulated as an online budget allocation problem, has emerged as a pivotal factor in driving user growth. Most existing research addresses this problem by following the principle of 'first predict, then optimize' for each individual, which raises challenges in large-scale counterfactual prediction and in trading off solution quality against solving complexity. In practice, data quality is uncontrollable and the problem scale tends to be in the tens of millions. Robust budget allocation is therefore non-trivial for existing approaches, especially in industrial scenarios with considerable data noise. To this end, this paper proposes a novel approach that solves the problem from the cluster perspective. Specifically, we propose a multi-task representation network to learn the inherent attributes of individuals and project the original features into high-dimensional hidden representations through the first two layers of the trained network. Then, we divide these hidden representations into $K$ groups through partitioning-based clustering, thus reformulating the problem as an integer stochastic programming problem under different total budgets. Finally, we distill the representation module and clustering model into a multi-category model to facilitate online deployment. Offline experiments validate the effectiveness and superiority of our approach compared to six state-of-the-art marketing optimization algorithms. Online A/B tests on the Meituan platform indicate that the approach outperforms the online algorithm by 0.53% in order volume (OV) and 0.65% in gross merchandise volume (GMV).
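One way to write down the group-level problem described in the abstract is sketched below. It is an illustration under stated assumptions, not the paper's exact formulation: the symbols $n_k$, $J$, $r_{kj}$, $c_{kj}$, $x_{kj}$, and $B$ are introduced here and do not appear in the abstract. Suppose group $k$ contains $n_k$ individuals, treatment $j \in \{1, \dots, J\}$ yields a random per-individual response $r_{kj}$ at a random per-individual cost $c_{kj}$, and $B$ is the total budget. Taking expectations is only one possible way to handle the stochastic terms.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{k=1}^{K}\sum_{j=1}^{J} n_k\,\mathbb{E}[r_{kj}]\,x_{kj} \\
\text{s.t.}\quad & \sum_{k=1}^{K}\sum_{j=1}^{J} n_k\,\mathbb{E}[c_{kj}]\,x_{kj} \le B, \\
& \sum_{j=1}^{J} x_{kj} = 1 \quad \text{for all } k = 1, \dots, K, \\
& x_{kj} \in \{0, 1\}.
\end{aligned}
```

Under this sketch the number of decision variables is $K \cdot J$ rather than one set per individual, which is the sense in which clustering shrinks the solving scale.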
Problem

Research questions and friction points this paper is trying to address.

Robust online budget allocation amid data noise
Large-scale counterfactual prediction complexity challenges
Clustering-based reformulation of stochastic programming problem
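The clustering-based reformulation can be illustrated with a small solver. This is a minimal sketch, not the paper's algorithm: it assumes each group receives exactly one of a few discrete treatments, that per-group values and integer costs have already been estimated, and it uses a standard multiple-choice knapsack dynamic program. The function name `allocate_budget` and the toy numbers are hypothetical.

```python
from typing import List, Tuple


def allocate_budget(
    values: List[List[float]],
    costs: List[List[int]],
    budget: int,
) -> Tuple[float, List[int]]:
    """Choose exactly one treatment per group, maximizing value within budget.

    values[k][j]: estimated total response of group k under treatment j (assumed given).
    costs[k][j]:  integer total cost of treatment j for group k (assumed given).
    Returns (best total value, list of chosen treatment indices per group).
    """
    neg_inf = float("-inf")
    # best[b] = best value achievable when the groups seen so far cost exactly b.
    best = [neg_inf] * (budget + 1)
    best[0] = 0.0
    picks: List[List[int]] = []  # picks[k][b] = treatment of group k on the best path to b

    for group_values, group_costs in zip(values, costs):
        new_best = [neg_inf] * (budget + 1)
        pick = [-1] * (budget + 1)
        for spent in range(budget + 1):
            if best[spent] == neg_inf:
                continue  # this spend level is unreachable
            for j, (v, c) in enumerate(zip(group_values, group_costs)):
                total = spent + c
                if total <= budget and best[spent] + v > new_best[total]:
                    new_best[total] = best[spent] + v
                    pick[total] = j
        best = new_best
        picks.append(pick)

    spent = max(range(budget + 1), key=lambda b: best[b])
    if best[spent] == neg_inf:
        raise ValueError("no feasible allocation within the budget")
    total_value = best[spent]

    # Walk back through the pick tables to recover one treatment per group.
    plan: List[int] = []
    for k in range(len(picks) - 1, -1, -1):
        j = picks[k][spent]
        plan.append(j)
        spent -= costs[k][j]
    plan.reverse()
    return total_value, plan


# Toy data: 3 groups, 3 treatments each (treatment 0 = no incentive).
toy_values = [[0.0, 5.0, 8.0], [0.0, 4.0, 10.0], [0.0, 3.0, 4.0]]
toy_costs = [[0, 2, 4], [0, 3, 5], [0, 1, 3]]
print(allocate_budget(toy_values, toy_costs, budget=6))  # (13.0, [0, 2, 1])
```

The table has one row per group and one column per budget unit, so the work grows with the number of groups rather than with the number of individuals. Re-running the function with a different `budget` corresponds to solving the problem under a different total budget.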
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-task representation network for attribute learning
Hidden representation clustering for problem reformulation
Multi-category model for online deployment
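The hidden representation clustering step can be sketched as follows. This is an illustration only, under several assumptions: the hidden vectors are taken as already extracted from the trained network, plain Lloyd's k-means stands in for "partitioning-based clustering" (the abstract does not name a specific algorithm), and farthest-point initialization is chosen here just to keep the example deterministic. The names `kmeans` and `nearest` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def kmeans(
    points: Sequence[Point], k: int, max_iters: int = 100
) -> Tuple[List[List[float]], List[int]]:
    """Lloyd's k-means with deterministic farthest-point initialization."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Start from the first point, then repeatedly add the point that is
    # farthest from its closest already-chosen centroid.
    centroids: List[List[float]] = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points, key=lambda p: min(squared_distance(p, c) for c in centroids)
        )
        centroids.append(list(farthest))

    labels = [-1] * len(points)
    for _ in range(max_iters):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not move
        labels = new_labels
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy hidden representations forming two well-separated groups.
hidden = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [0.1, 0.2], [4.9, 5.0]]
centroids, labels = kmeans(hidden, k=2)
print(labels)  # [0, 0, 1, 1, 0, 1]
```

The resulting labels could then serve as targets for a multi-category model, which is how the abstract describes the distillation step; that training stage is not shown here.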
Xiaohan Wang
Meituan
Yu Zhang
Meituan
Guibin Jiang
Meituan
Bing Cheng
Chinese Academy of Sciences
machine learning, artificial intelligence, finance, economics
Wei Lin
Meituan