Ensemble Learning via Knowledge Transfer for CTR Prediction

📅 2024-11-25

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Problem: In click-through rate (CTR) prediction, large ensemble models suffer from performance degradation, sub-network instability, and prediction bias. Adding more sub-networks raises variance and widens the gap between sub-network and ensemble predictions. Method: The paper proposes the model-agnostic Ensemble Knowledge Transfer Framework (EKTF), a knowledge distillation approach that treats the ensemble's collective decision as an abstract "teacher." EKTF combines teacher-to-student distillation with peer-to-peer deep mutual learning, and uses an examination mechanism to balance the loss terms, which stabilizes training and aligns sub-network behavior. Contribution/Results: Evaluated on five real-world CTR datasets, EKTF improves prediction accuracy and robustness: sub-network prediction variance is reduced by 32%–47%, and generalization consistently surpasses state-of-the-art ensemble methods. EKTF thereby eases the scalability bottleneck of ensemble size, allowing larger ensembles without performance deterioration.

📝 Abstract

Click-through rate (CTR) prediction plays a critical role in recommender systems and web search. While many existing methods use ensemble learning to improve model performance, they typically limit the ensemble to two or three sub-networks, with little exploration of larger ensembles. In this paper, we investigate larger ensemble networks and find three inherent limitations in commonly used ensemble learning methods: (1) performance degradation with more networks; (2) sharp decline and high variance in sub-network performance; (3) large discrepancies between sub-network and ensemble predictions. To address these limitations simultaneously, this paper investigates potential solutions from the perspectives of Knowledge Distillation (KD) and Deep Mutual Learning (DML). Based on the empirical performance of these methods, we combine them to propose a novel model-agnostic Ensemble Knowledge Transfer Framework (EKTF). Specifically, we employ the collective decision-making of the students as an abstract teacher to guide each student (sub-network) towards more effective learning. Additionally, we encourage mutual learning among students to enable knowledge acquisition from different views. To address the issue of balancing the loss hyperparameters, we design a novel examination mechanism to ensure tailored teacher-to-student teaching and selective peer-to-peer learning. Experimental results on five real-world datasets demonstrate the effectiveness and compatibility of EKTF. The code, running logs, and detailed hyperparameter configurations are available at: https://github.com/salmon1802/EKTF.
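
The abstract describes three ingredients: an abstract teacher formed from the students' collective decision, mutual learning among students, and an examination mechanism that decides when each kind of transfer applies. The abstract does not give the loss formulas, so the following is only a minimal single-sample sketch under stated assumptions: the teacher is taken to be the mean of the student logits, the transfer terms are squared differences between predicted probabilities, and the "examination" is a simple gate that compares binary cross-entropy losses. The function name `ektf_style_losses` and all of these choices are hypothetical and are not taken from the paper or its repository.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(y, p, eps=1e-7):
    """Binary cross-entropy for one sample, with clipping for numerical safety."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def ektf_style_losses(logits, y):
    """Per-student loss for one sample (illustrative, not the paper's exact loss).

    logits: list of K student (sub-network) logits; y: click label, 0 or 1.
    """
    probs = [sigmoid(z) for z in logits]
    # Assumption: the abstract teacher is the mean of the student logits.
    teacher_p = sigmoid(sum(logits) / len(logits))
    task = [bce(y, p) for p in probs]
    teacher_task = bce(y, teacher_p)

    losses = []
    for i, p_i in enumerate(probs):
        loss = task[i]
        # Assumed examination gate (teacher-to-student): distil only when the
        # teacher does better than this student on the sample.
        if teacher_task < task[i]:
            loss += (p_i - teacher_p) ** 2
        # Assumed examination gate (peer-to-peer): learn only from peers whose
        # task loss is lower than this student's.
        for j, p_j in enumerate(probs):
            if j != i and task[j] < task[i]:
                loss += (p_i - p_j) ** 2
        losses.append(loss)
    return losses, teacher_p


# Three students, positive label: the first student is the most accurate.
losses, teacher_p = ektf_style_losses([2.0, 0.0, -1.0], y=1)
print(losses, teacher_p)
```

In this toy call the most accurate student receives only its task loss, while the weaker students pick up extra terms pulling them toward the teacher and toward better peers. In an autograd framework, the teacher and peer targets would typically be treated as constants when computing each student's gradient; that detail is also an assumption here.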

Problem

Research questions and friction points this paper is trying to address.

Investigates performance degradation in large CTR ensemble networks

Addresses sharp decline and high variance in sub-network performance, and large discrepancies between sub-network and ensemble predictions, via knowledge transfer

Proposes a scalable framework combining KD and DML methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Distillation uses the students' collective decision as an abstract teacher for each sub-network

Deep Mutual Learning lets sub-networks learn from one another's views

Combined framework improves ensemble stability
