🤖 AI Summary
This work addresses the entanglement of cross-domain and intra-domain user intents in multi-domain session-based recommendation by proposing a novel three-encoder architecture. It introduces, for the first time, an orthogonal preference decomposition strategy that explicitly disentangles user preferences into domain-specific, domain-common, and cross-sequence-exclusive components. A dynamic gating mechanism weights these components at each timestep to produce a temporally adaptive session representation. Domain-masked objectives, adversarial training through gradient reversal layers, and constraints enforcing representation alignment and independence together keep the three components separate and make their fusion interpretable. Evaluated on two large-scale real-world multi-domain datasets, the proposed method consistently outperforms state-of-the-art baselines and offers interpretable insights into cross-domain preference interactions.
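To make the adversarial component concrete, the following is a minimal sketch of what a gradient reversal layer does in general: it passes activations through unchanged in the forward pass and flips the sign of the gradient (scaled by a factor) in the backward pass. The class name `GradientReversal`, the parameter `lam`, and the manual `forward`/`backward` interface are illustrative assumptions; the summary does not describe how MOSAIC implements this layer.

```python
class GradientReversal:
    """Illustrative gradient reversal layer (hypothetical interface).

    Forward: identity. Backward: multiply the incoming gradient by -lam,
    so the encoder below is pushed to *confuse* a domain classifier above.
    """

    def __init__(self, lam=1.0):
        # lam is an assumed scaling factor for the reversed gradient.
        self.lam = lam

    def forward(self, x):
        # Activations pass through unchanged.
        return list(x)

    def backward(self, grad_output):
        # The gradient flowing back to the encoder is negated and scaled.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)
features = grl.forward([1.0, -2.0])        # same values as the input
encoder_grad = grl.backward([0.2, -0.4])   # sign-flipped, scaled by 0.5
```

In an autograd framework the same behavior would normally be written as a custom differentiable function rather than explicit `forward`/`backward` calls.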
📝 Abstract
Capturing user intent across heterogeneous behavioral domains is a fundamental challenge in session-based recommender systems. Yet existing multi-domain approaches frequently fail to isolate the contribution of cross-domain interactions from that of interactions within individual domains, limiting their ability to build rich and transferable user representations. In this work, we propose MOSAIC, a Multi-Domain Orthogonal Session Adaptive Intent Capture framework that explicitly factorizes user preferences into three orthogonal components: domain-specific, domain-common, and cross-sequence-exclusive representations. Our approach employs a triple-encoder architecture in which each encoder is dedicated to one preference type; this separation is enforced through domain-masking objectives and adversarial training via a gradient reversal layer. Representational alignment and mutual independence constraints are jointly optimized to ensure clean preference separation. Additionally, a dynamic gating mechanism modulates the relative contribution of each component at every timestep, yielding a unified and temporally adaptive session-level user representation. We conduct extensive experiments on two large-scale real-world benchmarks spanning multiple domains and interaction types. The ablation study validates that each component (domain-specific encoding, domain-common modeling, cross-sequence representation, and dynamic gating) contributes meaningfully to overall performance. Experimental results demonstrate that MOSAIC consistently outperforms state-of-the-art baselines in recommendation accuracy while providing interpretable insights into the interplay between domain-specific and cross-domain preference signals. These findings highlight the potential of orthogonal preference decomposition as a principled strategy for next-generation multi-domain recommender systems.
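The two ideas of per-timestep gating and an independence constraint can be sketched as follows. This is a simplified illustration under stated assumptions, not the formulation used by MOSAIC: the gate is assumed to be a softmax over one learned scoring vector per component, and the independence constraint is assumed to be a sum of squared cosine similarities between component pairs. The names `gate_fuse`, `orthogonality_penalty`, and `gate_vectors` are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_fuse(components, gate_vectors):
    """Fuse the component vectors for a single timestep.

    components:   [domain_specific, domain_common, cross_sequence_exclusive]
    gate_vectors: assumed learned scoring vectors, one per component.
    Returns the fused vector and the gate weights (which sum to 1).
    """
    scores = [dot(w, h) for w, h in zip(gate_vectors, components)]
    weights = softmax(scores)
    dim = len(components[0])
    fused = [sum(wt * h[i] for wt, h in zip(weights, components))
             for i in range(dim)]
    return fused, weights


def orthogonality_penalty(components, eps=1e-12):
    """Sum of squared cosine similarities over all component pairs.

    The value is 0 when the components are mutually orthogonal and grows
    as they align, so minimizing it encourages separation.
    """
    penalty = 0.0
    for i in range(len(components)):
        for j in range(i + 1, len(components)):
            a, b = components[i], components[j]
            norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)) + eps
            penalty += (dot(a, b) / norm) ** 2
    return penalty


# Toy timestep: three mutually orthogonal 3-d component vectors.
parts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
fused, weights = gate_fuse(parts, gate_vectors=parts)
separated = orthogonality_penalty(parts)
overlapping = orthogonality_penalty([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

Because the gate weights sum to one at each timestep, they can be read directly as the relative contribution of each preference type, which is one plausible route to the interpretability the abstract describes.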