REMEDI: Relative Feature Enhanced Meta-Learning with Distillation for Imbalanced Prediction

📅 2025-05-12
🤖 AI Summary
This paper addresses two challenges in vehicle purchase prediction: extreme class imbalance (positive sample rate < 0.5%) and heterogeneous user behavior. It proposes a multi-stage modeling framework that (1) trains diverse base models to capture behavioral heterogeneity; (2) builds relative-performance meta-features, such as each model's deviation from the ensemble mean and its rank among peers, to guide meta-learning-based ensemble fusion; and (3) applies supervised knowledge distillation, oriented toward the business objective, to compress the ensemble into a lightweight, deployable single model. The approach combines meta-learning, ensemble learning, knowledge distillation, and relative feature engineering. Evaluated on a real-world dataset of approximately 800,000 car owners, the method achieves about 10% precision in the top 60,000 recommendations while covering about 50% of actual buyers, substantially outperforming the baseline approaches. The distilled model retains over 98% of the ensemble's predictive performance while significantly improving inference efficiency and operational scalability.
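The headline numbers above are a top-K precision and a coverage (recall) figure. The following is a minimal sketch of how such metrics are commonly computed; the function name `precision_recall_at_k` and the toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def precision_recall_at_k(scores, labels, k):
    """Return (precision@k, recall@k) for binary labels ranked by score.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    k = min(k, len(scores))
    # Indices of the k highest-scoring users (order inside the top k is irrelevant).
    top_k = np.argpartition(-scores, k - 1)[:k]
    hits = labels[top_k].sum()
    precision = hits / k                   # share of recommended users who buy
    recall = hits / max(labels.sum(), 1)   # share of all buyers that are covered
    return float(precision), float(recall)

# Tiny synthetic example: 10 users, 2 actual buyers, a top-4 recommendation list.
scores = np.array([0.9, 0.1, 0.8, 0.3, 0.2, 0.7, 0.05, 0.6, 0.4, 0.15])
labels = np.array([1, 0, 0, 0, 0, 0, 0, 1, 0, 0])
p, r = precision_recall_at_k(scores, labels, k=4)
print(p, r)
```

In the setting described here, `k` would correspond to the 60,000-user recommendation list and `labels` to observed purchases.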

📝 Abstract
Predicting future vehicle purchases among existing owners presents a critical challenge due to extreme class imbalance (<0.5% positive rate) and complex behavioral patterns. We propose REMEDI (Relative feature Enhanced Meta-learning with Distillation for Imbalanced prediction), a novel multi-stage framework addressing these challenges. REMEDI first trains diverse base models to capture complementary aspects of user behavior. Second, inspired by comparative optimization techniques, we introduce relative performance meta-features (deviation from ensemble mean, rank among peers) for effective model fusion through a hybrid-expert architecture. Third, we distill the ensemble's knowledge into a single efficient model via supervised fine-tuning with MSE loss, enabling practical deployment. Evaluated on approximately 800,000 vehicle owners, REMEDI significantly outperforms baseline approaches, achieving the business target of identifying ~50% of actual buyers within the top 60,000 recommendations at ~10% precision. The distilled model preserves the ensemble's predictive power while maintaining deployment efficiency, demonstrating REMEDI's effectiveness for imbalanced prediction in industry settings.
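One plausible reading of the relative performance meta-features is sketched below: for each user, every base model's prediction is compared with the mean prediction of all base models and ranked against its peers. The function name, the row-wise ranking, and the scaling of ranks to [0, 1] are assumptions made for illustration; the abstract does not specify these details, and the hybrid-expert fusion model itself is not shown.

```python
import numpy as np

def relative_meta_features(base_preds):
    """Compute illustrative relative meta-features.

    base_preds: array of shape (n_samples, n_models) with predicted probabilities.
    Returns (deviation, rank_scaled), each of shape (n_samples, n_models).
    """
    base_preds = np.asarray(base_preds, dtype=float)
    # Deviation of each model's prediction from the per-sample ensemble mean.
    ensemble_mean = base_preds.mean(axis=1, keepdims=True)
    deviation = base_preds - ensemble_mean
    # Double argsort gives each model's rank within its row (0 = lowest prediction).
    # Ties are broken by column order, which is an arbitrary choice in this sketch.
    ranks = base_preds.argsort(axis=1).argsort(axis=1)
    n_models = base_preds.shape[1]
    rank_scaled = ranks / max(n_models - 1, 1)  # scale ranks to [0, 1]
    return deviation, rank_scaled

# Two users scored by three hypothetical base models.
preds = np.array([[0.2, 0.5, 0.8],
                  [0.9, 0.1, 0.5]])
deviation, rank = relative_meta_features(preds)
# Input matrix for a meta-learner: raw predictions plus relative features.
meta_X = np.hstack([preds, deviation, rank])
```

Stacking the raw predictions with the relative features lets a meta-learner see both what each model predicts and how far it departs from the group.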
Problem

Research questions and friction points this paper is trying to address.

Addresses extreme class imbalance in vehicle purchase prediction
Enhances model fusion using relative performance meta-features
Distills ensemble knowledge into a single deployable model
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diverse base models capture complementary user behavior
Relative performance meta-features enhance model fusion
Knowledge distillation into single efficient deployment model
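The distillation step can be illustrated with a small sketch in which a student model is trained to reproduce the ensemble's scores under an MSE loss. Everything concrete here is an assumption for illustration: the student is a plain logistic model trained by gradient descent, and the "teacher" scores are synthetic stand-ins, whereas the paper's actual student architecture and training setup are not described in this summary.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def distill_mse(X, teacher_scores, lr=0.5, epochs=2000):
    """Fit a logistic student to teacher scores by minimizing mean squared error."""
    X = np.asarray(X, dtype=float)
    t = np.asarray(teacher_scores, dtype=float)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        s = sigmoid(X @ w + b)
        # Gradient of mean((s - t)^2) with respect to the pre-activation z.
        grad_z = 2.0 * (s - t) * s * (1.0 - s) / n
        w -= lr * (X.T @ grad_z)
        b -= lr * grad_z.sum()
    return w, b

# Synthetic stand-in for ensemble (teacher) scores on 500 users with 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
teacher = sigmoid(X @ np.array([1.0, -2.0, 0.5]) - 1.0)

w, b = distill_mse(X, teacher)
mse_before = np.mean((0.5 - teacher) ** 2)  # untrained student outputs 0.5 everywhere
mse_after = np.mean((sigmoid(X @ w + b) - teacher) ** 2)
print(mse_before, mse_after)
```

Regressing on the teacher's continuous scores, instead of the rare binary labels alone, gives the student a dense training signal even when positives are below 0.5% of the data.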
Fei Liu
China Automotive Technology & Research Center, Tianjin, 300300, China
Huanhuan Ren
China Automotive Technology & Research Center, Tianjin, 300300, China
Yu Guan
China Automotive Technology & Research Center, Tianjin, 300300, China
Xiuxu Wang
China Automotive Technology & Research Center, Tianjin, 300300, China
Wang Lv
China Automotive Technology & Research Center, Tianjin, 300300, China
Zhiqiang Hu
China Automotive Technology & Research Center, Tianjin, 300300, China
Yaxi Chen
University College London
Medical Imaging