Dual-Stream MLP is All You Need for CTR Prediction

📅 2026-06-03
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses two problems in click-through rate (CTR) prediction: the high complexity of feature interaction learning, and the imbalance between explicit and implicit interaction modules, where one module's output can dominate the final prediction. The authors propose a dual-stream MLP framework (DS-MLP) in which a main MLP absorbs the ability to learn explicit feature interactions via knowledge distillation, while a parallel MLP captures implicit interactions as a complement. The two streams are optimized with a learning approach built on two alignment strategies that improve their compatibility. The final model is a vanilla MLP structure, which the authors argue lowers computational demands and overfitting risk. They report state-of-the-art performance on three widely used CTR benchmarks and position the method as a scalable, efficient option for large-scale recommender systems.
📝 Abstract
Click-through rate (CTR) prediction plays a pivotal role in online advertising and recommendation systems, where even small improvements can significantly boost revenue. Existing research primarily focuses on designing dual-stream architectures to capture effective complex feature interactions from both explicit and implicit perspectives. However, these approaches face two major challenges: 1) the high complexity of feature interaction learning, which increases computational demands and the risk of overfitting, and 2) the imbalance between explicit and implicit modules, where one module's output may dominate the final prediction. To address these issues, we propose Dual-Stream MLP (DS-MLP), a novel feature interaction framework for the CTR prediction task. Specifically, it leverages knowledge distillation to consolidate the capacity for learning explicit feature interactions into a main MLP network, while a parallel MLP simultaneously captures implicit feature interactions as a complement. To effectively optimize the dual-stream MLP architecture, we further design a learning approach with two alignment strategies that enhance the compatibility of the two MLP components. Experiments demonstrate that DS-MLP, although its final model is merely a vanilla MLP structure, can achieve state-of-the-art performance across three widely used benchmarks, offering a scalable and efficient solution for large-scale recommendation systems. Our code is available at https://github.com/RUCAIBox/DS-MLP.
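To make the training idea more concrete, here is a minimal sketch of how a distillation term could be combined with the usual CTR objective for one example. It is not the authors' implementation. The abstract does not specify the fusion rule, the form of the distillation loss, or the two alignment strategies, so the summed logits, the squared-error distillation term, the `distill_weight` parameter, and all function names below are assumptions made for illustration only. The alignment strategies are not reproduced.

```python
import math


def sigmoid(z):
    """Map a logit to a click probability."""
    return 1.0 / (1.0 + math.exp(-z))


def bce(label, prob, eps=1e-12):
    """Binary cross-entropy for a single example, clamped for numerical safety."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def dual_stream_loss(label, main_logit, parallel_logit, teacher_logit,
                     distill_weight=0.5):
    """Hypothetical per-example loss for a dual-stream MLP.

    main_logit     : output of the main MLP (the distillation student)
    parallel_logit : output of the parallel MLP (implicit interactions)
    teacher_logit  : output of an explicit-interaction module (the teacher)
    """
    # Assumption: the two streams are fused by summing their logits.
    fused_prob = sigmoid(main_logit + parallel_logit)
    task = bce(label, fused_prob)
    # Assumption: distillation pulls the main MLP's logit toward the teacher's.
    distill = (main_logit - teacher_logit) ** 2
    total = task + distill_weight * distill
    return total, task, distill


# One clicked example; the teacher is more confident than the main MLP.
total, task, distill = dual_stream_loss(1.0, 1.0, 0.0, 3.0)
print(f"task={task:.4f} distill={distill:.1f} total={total:.4f}")
```

In a setup like this, the teacher would only be needed during training; at inference time the prediction would come from the two MLP streams alone, which is consistent with the abstract's description of the final model as a vanilla MLP structure.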
Problem

Research questions and friction points this paper is trying to address.

CTR prediction
feature interaction
dual-stream architecture
overfitting
module imbalance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-Stream MLP
Knowledge Distillation
Feature Interaction
CTR Prediction
Alignment Strategy