🤖 AI Summary
Problem: In click-through rate (CTR) prediction, large ensemble models suffer from performance degradation, subnetwork instability, and prediction bias. Adding more subnetworks paradoxically raises variance and widens the divergence between subnetwork and ensemble predictions.
Method: To address this, we propose the model-agnostic Ensemble Knowledge Transfer Framework (EKTF), a knowledge distillation paradigm that treats the ensemble’s collective decision as an abstract “teacher.” EKTF combines teacher-to-student distillation with deep mutual learning among subnetworks (peer-to-peer learning), and adds an examination mechanism that removes the need to hand-balance the loss hyperparameters. Together these stabilize training and align subnetwork behaviors.
Contribution/Results: Evaluated on five real-world CTR datasets, EKTF significantly improves prediction accuracy and robustness: subnetwork prediction variance is reduced by 32%–47%, and generalization consistently surpasses state-of-the-art ensemble methods. Crucially, EKTF overcomes the traditional scalability bottleneck of ensemble size, enabling larger, more effective ensembles without performance deterioration.
📝 Abstract
Click-through rate (CTR) prediction plays a critical role in recommender systems and web search. While many existing methods use ensemble learning to improve model performance, they typically limit the ensemble to two or three sub-networks, with little exploration of larger ensembles. In this paper, we investigate larger ensemble networks and find three inherent limitations in commonly used ensemble learning methods: (1) performance degradation with more networks; (2) sharp decline and high variance in sub-network performance; (3) large discrepancies between sub-network and ensemble predictions. To simultaneously address the above limitations, this paper investigates potential solutions from the perspectives of Knowledge Distillation (KD) and Deep Mutual Learning (DML). Based on the empirical performance of these methods, we combine them to propose a novel model-agnostic Ensemble Knowledge Transfer Framework (EKTF). Specifically, we employ the collective decision-making of the students as an abstract teacher to guide each student (sub-network) towards more effective learning. Additionally, we encourage mutual learning among students to enable knowledge acquisition from different views. To address the issue of balancing the loss hyperparameters, we design a novel examination mechanism to ensure tailored teacher-to-student teaching and selective peer-to-peer learning. Experimental results on five real-world datasets demonstrate the effectiveness and compatibility of EKTF. The code, running logs, and detailed hyperparameter configurations are available at: https://github.com/salmon1802/EKTF.
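The abstract's three ingredients (an abstract teacher formed from the students' collective decision, mutual learning among students, and an examination mechanism in place of hand-tuned loss weights) can be illustrated with a minimal per-sample sketch. This is not the authors' implementation; see the linked repository for that. The assumptions here are: the teacher is the sigmoid of the mean student logit, the transfer terms are squared differences between predicted probabilities, and the "examination" is a gate that lets a student learn from the teacher or a peer only when that source has a lower cross-entropy on the sample. The name `ektf_style_loss` is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one prediction p and label y."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def ektf_style_loss(student_logits, y):
    """Average per-student loss for one sample (illustrative only).

    student_logits: list of K raw logits, one per sub-network.
    y: click label, 0.0 or 1.0.
    """
    k = len(student_logits)
    probs = [sigmoid(z) for z in student_logits]

    # Assumption: the abstract teacher is the mean of the student logits.
    # In a real training loop the teacher and peer targets would be
    # treated as constants (stop-gradient) when used as targets.
    teacher_prob = sigmoid(sum(student_logits) / k)
    teacher_bce = bce(teacher_prob, y)
    student_bce = [bce(p, y) for p in probs]

    total = 0.0
    for i in range(k):
        loss = student_bce[i]  # supervised term
        # Assumed examination gate, teacher-to-student: learn from the
        # teacher only if it does better than this student on the sample.
        if teacher_bce < student_bce[i]:
            loss += (probs[i] - teacher_prob) ** 2
        # Assumed examination gate, peer-to-peer: learn only from peers
        # that do better than this student on the sample.
        for j in range(k):
            if j != i and student_bce[j] < student_bce[i]:
                loss += (probs[i] - probs[j]) ** 2
        total += loss
    return total / k


if __name__ == "__main__":
    # One confident-correct student and one confident-wrong student.
    print(ektf_style_loss([2.0, -2.0], 1.0))
```

Because the gates are binary, no distillation or mutual-learning weight needs tuning in this sketch: when all students agree, every gate stays closed and the loss reduces to plain cross-entropy.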