🤖 AI Summary
This work addresses the misalignment between offline and online metrics in multi-objective advertising ranking systems, which stems from inconsistent score scales across traffic segments and position bias in click logs. The authors propose CaliCausalRank, a unified training framework that treats score calibration as a first-class training objective rather than a post-processing step. The approach uses Lagrangian relaxation to enforce constraints in multi-objective optimization and variance-reduced counterfactual utility estimators to make offline evaluation more reliable. Experiments on the Criteo and Avazu datasets show a 1.1% relative AUC improvement over the strongest baseline, PairRank, along with a 31.6% reduction in calibration error and a 3.2% gain in utility, with consistent performance across traffic segments.
📝 Abstract
Ad ranking systems must simultaneously optimize multiple objectives, including click-through rate (CTR), conversion rate (CVR), revenue, and user experience metrics. However, production systems face two critical challenges: score scale inconsistency across traffic segments undermines threshold transferability, and position bias in click logs causes offline-online metric discrepancies. We propose CaliCausalRank, a unified framework that integrates training-time scale calibration, constraint-based multi-objective optimization, and robust counterfactual utility estimation. Our approach treats score calibration as a first-class training objective rather than post-hoc processing, employs Lagrangian relaxation for constraint satisfaction, and utilizes variance-reduced counterfactual estimators for reliable offline evaluation. Experiments on the Criteo and Avazu datasets demonstrate that CaliCausalRank achieves a 1.1% relative AUC improvement, a 31.6% reduction in calibration error, and a 3.2% utility gain compared to the best baseline (PairRank), while maintaining consistent performance across different traffic segments.
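To make two of the ingredients named above more concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. The abstract does not specify which variance-reduced estimator or which dual update CaliCausalRank uses. Here we assume a clipped self-normalized inverse propensity scoring (SNIPS) estimator for counterfactual utility and a projected dual-ascent step for a single Lagrange multiplier. The names `snips_utility`, `dual_ascent_step`, `clip`, `budget`, and `lr` are hypothetical.

```python
import numpy as np


def snips_utility(rewards, target_probs, logging_probs, clip=10.0):
    """Clipped self-normalized IPS estimate of a target policy's utility.

    Assumption: this stands in for the paper's "variance-reduced
    counterfactual estimator", whose exact form the abstract does not give.
    """
    rewards = np.asarray(rewards, dtype=float)
    # Importance weights re-weight logged impressions toward the target policy.
    weights = np.asarray(target_probs, dtype=float) / np.asarray(
        logging_probs, dtype=float
    )
    # Clipping bounds the influence of rare, heavily up-weighted impressions.
    weights = np.minimum(weights, clip)
    # Normalizing by the weight sum (not the sample count) trades a small
    # bias for lower variance.
    return float(np.sum(weights * rewards) / np.sum(weights))


def dual_ascent_step(lmbda, constraint_value, budget, lr=0.1):
    """One projected dual-ascent update for a constraint g(theta) <= budget.

    Assumption: a generic Lagrangian-relaxation update; the multiplier grows
    while the constraint is violated and is projected back to zero otherwise.
    """
    return max(0.0, lmbda + lr * (constraint_value - budget))
```

In a training loop of this kind, the model parameters would be updated on the primary ranking loss plus `lmbda` times the constraint term (for example, a calibration error measured per traffic segment), with `dual_ascent_step` called after each batch or epoch.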