Privacy Preserving Conversion Modeling in Data Clean Room

📅 2024-10-08
🏛️ ACM Conference on Recommender Systems
📈 Citations: 1
Influential: 0
🤖 AI Summary
In data clean room settings, CVR prediction faces dual constraints: stringent user privacy protection and the requirement that advertisers’ data remain within their own domain. To address this, we propose the first collaborative training framework integrating batch-level gradient aggregation, Adapter-based efficient fine-tuning, and label differential privacy with bias mitigation. Without sharing raw labels or model parameters, our method enables cross-domain joint modeling via gradient-level collaboration: batch-wise gradient aggregation ensures regulatory compliance; lightweight Adapters enable low-overhead domain adaptation; and bias-corrected label differential privacy mitigates estimation bias induced by noise injection. Evaluated on industrial datasets, our approach achieves state-of-the-art ROC-AUC performance while reducing communication overhead by 62%. It strictly adheres to GDPR and other privacy regulations, fulfilling practical commercial deployment requirements.
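The paper does not publish its training code, but the batch-level gradient aggregation idea can be illustrated with a minimal sketch. Assuming a simple logistic-regression conversion model as a stand-in for the real network, the advertiser computes sample-level gradients inside the clean room and shares only their batch aggregate with the platform (the function name `batch_level_gradient` is hypothetical):

```python
import numpy as np

def batch_level_gradient(features, labels, weights):
    """Logistic-regression gradient aggregated over one batch.

    Per-sample gradients are computed locally and only their mean is
    returned, so no individual sample's gradient leaves the clean room.
    """
    logits = features @ weights
    preds = 1.0 / (1.0 + np.exp(-logits))            # sigmoid
    per_sample = (preds - labels)[:, None] * features  # sample-level grads
    return per_sample.mean(axis=0)                     # share only the aggregate

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))                  # 32 samples, 4 features
y = rng.integers(0, 2, size=32).astype(float)  # binary conversion labels
w = np.zeros(4)
g = batch_level_gradient(X, y, w)
print(g.shape)  # (4,) — one gradient vector for the whole batch
```

The privacy gain comes from the aggregation step: the platform receives a single vector per batch rather than one per sample, which removes the per-user gradient signal that sample-level exchange would leak.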

📝 Abstract
In the realm of online advertising, accurately predicting the conversion rate (CVR) is crucial for enhancing advertising efficiency and user satisfaction. This paper addresses the challenge of CVR prediction while adhering to user privacy preferences and advertiser requirements. Traditional methods face obstacles such as the reluctance of advertisers to share sensitive conversion data and the limitations of model training in secure environments like data clean rooms. We propose a novel model training framework that enables collaborative model training without sharing sample-level gradients with the advertising platform. Our approach introduces several innovative components: (1) utilizing batch-level aggregated gradients instead of sample-level gradients to minimize privacy risks; (2) applying adapter-based parameter-efficient fine-tuning and gradient compression to reduce communication costs; and (3) employing de-biasing techniques to train the model under label differential privacy, thereby maintaining accuracy despite privacy-enhanced label perturbations. Our experimental results, conducted on industrial datasets, demonstrate that our method achieves competitive ROC-AUC performance while significantly decreasing communication overhead and complying with both advertisers’ privacy requirements and user privacy choices. This framework establishes a new standard for privacy-preserving, high-performance CVR prediction in the digital advertising landscape.
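The abstract's adapter-based parameter-efficient fine-tuning can be sketched as a standard bottleneck adapter (the paper does not specify its exact adapter architecture, so the dimensions and class below are illustrative assumptions): a small down-projection, a nonlinearity, a zero-initialized up-projection, and a residual connection, trained while the backbone stays frozen.

```python
import numpy as np

class Adapter:
    """Hypothetical bottleneck adapter: only these two small matrices
    are trained; the frozen backbone weights never change."""
    def __init__(self, dim, bottleneck, rng):
        self.down = rng.normal(scale=0.02, size=(dim, bottleneck))
        self.up = np.zeros((bottleneck, dim))  # zero init => identity at start

    def __call__(self, h):
        # residual + ReLU bottleneck
        return h + np.maximum(h @ self.down, 0.0) @ self.up

rng = np.random.default_rng(1)
dim, bottleneck = 256, 16
adapter = Adapter(dim, bottleneck, rng)
h = rng.normal(size=(8, dim))
out = adapter(h)                      # identity at initialization

adapter_params = 2 * dim * bottleneck  # 8192
dense_params = dim * dim               # 65536 for a comparable full layer
print(adapter_params / dense_params)   # 0.125
```

With these (assumed) sizes, the adapter carries 12.5% of the parameters of a full dense layer, which is what makes the gradient exchange cheap: only adapter gradients need to cross the clean-room boundary, contributing to the communication-overhead reduction the paper reports.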
Problem

Research questions and friction points this paper is trying to address.

Predicting conversion rates while preserving user privacy
Overcoming data sharing reluctance in secure environments
Reducing communication costs in privacy-preserving model training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses batch-level aggregated gradients for privacy
Applies adapter-based fine-tuning and gradient compression
Employs de-biasing under label differential privacy
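The label-differential-privacy component above can be sketched with randomized response plus a standard bias correction (the paper's exact de-biasing technique is not reproduced here; this is a minimal illustration). Flipping each binary label with probability p gives E[noisy] = p + y(1 - 2p), so the estimator (noisy - p)/(1 - 2p) is unbiased:

```python
import numpy as np

def randomized_response(labels, flip_prob, rng):
    """Label DP via randomized response: flip each binary label w.p. flip_prob."""
    flips = rng.random(labels.shape) < flip_prob
    return np.where(flips, 1 - labels, labels)

def debias(noisy_labels, flip_prob):
    """Unbiased correction: E[noisy] = p + y*(1-2p)  =>  y_hat = (noisy - p)/(1-2p)."""
    return (noisy_labels - flip_prob) / (1.0 - 2.0 * flip_prob)

rng = np.random.default_rng(42)
# imbalanced labels, as is typical for conversion events
y = (rng.random(100_000) < 0.1).astype(float)
p = 0.2
y_noisy = randomized_response(y, p, rng)

print(y_noisy.mean())           # pulled toward 0.5 by the noise (~0.26)
print(debias(y_noisy, p).mean())  # close to the true rate (~0.10)
```

Without the correction, noise injection systematically inflates the apparent conversion rate; training against the de-biased targets is what lets the model stay accurate under the privacy-enhanced label perturbations the abstract describes.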
Authors
Kungang Li — Pinterest Inc., USA
Xiangyi Chen — Pinterest (machine learning, optimization, differential privacy, artificial intelligence)
Ling Leng — Pinterest Inc., USA
Jiajing Xu — Pinterest (recommendation systems, information retrieval, deep learning)
Jiankai Sun — Pinterest Inc., USA
Behnam Rezaei — Pinterest, USA (currently at Roblox; work was done while the author was employed at Pinterest Inc.)