🤖 AI Summary
Collaborative filtering for related video recommendation often produces semantically incoherent results and suffers from severe popularity bias. To address this, the authors propose a multi-objective retrieval framework that explicitly co-optimizes semantic relevance and user engagement while mitigating popularity bias in an industrial-scale system. Methodologically, it extends a two-tower architecture with multi-task learning, multimodal (textual and visual) content embeddings, and off-policy correction via inverse propensity weighting. A two-week live A/B test shows a 12-percentage-point improvement in topic match rate (51% → 63%), a 13.8% reduction in popular video recommendations, and a +0.04% improvement in the topline user engagement metric. The framework balances semantic quality, reduced popularity bias, and system scalability, offering a deployable approach to joint semantic and behavioral optimization in related video recommendation.
📝 Abstract
Related video recommendation commonly relies on collaborative filtering (CF) driven by co-engagement signals, which often yields recommendations that lack semantic coherence and exhibit strong popularity bias. This paper introduces a novel multi-objective retrieval framework that extends standard two-tower models to explicitly balance semantic relevance and user engagement. Our approach combines: (a) multi-task learning (MTL) to jointly optimize co-engagement and semantic relevance, explicitly prioritizing topical coherence; (b) fusion of multimodal content features (textual and visual embeddings) for richer semantic understanding; and (c) off-policy correction (OPC) via inverse propensity weighting to mitigate popularity bias. Evaluation on industrial-scale data and a two-week live A/B test demonstrates the framework's efficacy. We observed significant improvements in semantic relevance (topic match rate rising from 51% to 63%), a reduction in the share of popular items (-13.8% popular video recommendations), and a +0.04% improvement in our topline user engagement metric. Our method achieves better semantic coherence, balanced engagement, and practical scalability for real-world deployment.
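
To make the three components (a) to (c) concrete, the following is a minimal NumPy sketch of how such a training objective might be assembled. It is not the authors' implementation: the function names, the concatenation-based fusion, the clipped and batch-normalized propensity weights, the in-batch softmax engagement loss, the sigmoid-scaled cosine semantic loss, and the `semantic_weight`, `temperature`, and `scale` parameters are all assumptions chosen for illustration.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def fuse_modalities(text_emb, visual_emb):
    """Hypothetical fusion: concatenate normalized text and visual embeddings."""
    return np.concatenate([l2_normalize(text_emb), l2_normalize(visual_emb)], axis=1)

def ipw_weights(propensity, clip=10.0):
    """Clipped inverse propensity weights, rescaled to mean 1 over the batch.

    `propensity` is an assumed per-example estimate of how likely the logged
    system was to show the positive item; popular items get larger values
    and therefore smaller weights.
    """
    w = np.minimum(1.0 / np.maximum(propensity, 1e-6), clip)
    return w / w.mean()

def in_batch_softmax_loss(query, cand, weights, temperature=0.1):
    """Weighted in-batch softmax: the positive for query i is candidate i."""
    logits = l2_normalize(query) @ l2_normalize(cand).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_example = -np.diag(log_probs)
    return float((weights * per_example).mean())

def semantic_relevance_loss(query, cand, topic_match, scale=5.0):
    """Binary cross-entropy between a cosine-based score and a topic-match label."""
    cos = np.sum(l2_normalize(query) * l2_normalize(cand), axis=1)
    p = np.clip(1.0 / (1.0 + np.exp(-scale * cos)), 1e-7, 1.0 - 1e-7)
    bce = -(topic_match * np.log(p) + (1.0 - topic_match) * np.log(1.0 - p))
    return float(bce.mean())

def multi_task_loss(query, cand, propensity, topic_match, semantic_weight=0.5):
    """Debiased engagement loss plus a weighted semantic relevance loss."""
    engagement = in_batch_softmax_loss(query, cand, ipw_weights(propensity))
    semantic = semantic_relevance_loss(query, cand, topic_match)
    return engagement + semantic_weight * semantic
```

In this sketch, `query` and `cand` stand for the outputs of the two towers for a batch of co-engaged (seed video, related video) pairs, and `fuse_modalities` would produce the content features fed into each tower. Raising `semantic_weight` shifts the trade-off toward topical coherence at the expense of the engagement objective.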