Image Fusion for Cross-Domain Sequential Recommendation

📅 2024-12-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Cross-domain sequential recommendation (CDSR) faces the key challenge of jointly modeling intra-sequence and inter-sequence item interactions to accurately capture users’ dynamic cross-domain preferences. To address this, we propose the first CDSR framework incorporating a frozen CLIP image encoder, together with an image-enhanced multi-attention mechanism. Specifically, CLIP’s visual embeddings capture fine-grained visual semantics of items; image–text feature fusion aligns visual semantics across domains; and a hierarchical cross-domain attention module jointly models intra-sequence local dependencies and inter-sequence cross-domain associations. Extensive experiments on four re-partitioned e-commerce datasets show that our method significantly outperforms state-of-the-art approaches, with a 12.6% improvement in Recall@10. These results support the effectiveness of leveraging visual information for modeling cross-domain user preferences.
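For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how Recall@K is commonly computed for next-item recommendation, assuming one held-out target item per user. The function name `recall_at_k` and the toy rankings are hypothetical; the paper's exact evaluation protocol is not described in this summary.

```python
def recall_at_k(ranked_lists, targets, k=10):
    """Fraction of users whose held-out target appears in their top-k ranking.

    Assumes exactly one target item per user, in which case Recall@K
    coincides with the hit rate.
    """
    hits = sum(1 for ranked, target in zip(ranked_lists, targets)
               if target in ranked[:k])
    return hits / len(targets)


# Hypothetical rankings for three users (best-scored item first).
ranked = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
targets = ["b", "x", "i"]  # held-out next item per user

recall_2 = recall_at_k(ranked, targets, k=2)  # only the first user is a hit
recall_3 = recall_at_k(ranked, targets, k=3)  # first and third users are hits
```

With a larger cutoff the metric can only stay the same or increase, which is why results are usually reported at a fixed K such as 10.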

📝 Abstract

Cross-Domain Sequential Recommendation (CDSR) aims to predict future user interactions based on historical interactions across multiple domains. The key challenge in CDSR is effectively capturing cross-domain user preferences by fully leveraging both intra-sequence and inter-sequence item interactions. In this paper, we propose a novel method, Image Fusion for Cross-Domain Sequential Recommendation (IFCDSR), which incorporates item image information to better capture visual preferences. Our approach integrates a frozen CLIP model to generate image embeddings, enriching the original item embeddings with visual data from both intra-sequence and inter-sequence interactions. Additionally, we employ a multi-attention layer to capture cross-domain interests, enabling joint learning of single-domain and cross-domain user preferences. To validate the effectiveness of IFCDSR, we re-partitioned four e-commerce datasets and conducted extensive experiments. Results demonstrate that IFCDSR significantly outperforms existing methods.
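The general idea in the abstract, enriching item embeddings with frozen image embeddings and then attending over both single-domain and mixed-domain sequences, can be illustrated with a small sketch. This is not the authors' implementation: the weighted-sum fusion, the `alpha` parameter, the single-head attention, and the item names are all assumptions made for illustration, and random vectors stand in for the outputs of a frozen CLIP image encoder.

```python
import math
import random


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(id_emb, img_emb, alpha=0.5):
    """Blend a trainable ID embedding with a frozen image embedding.

    A weighted sum is an assumed fusion rule, chosen only for simplicity.
    """
    return [(1 - alpha) * a + alpha * b for a, b in zip(id_emb, img_emb)]


def causal_self_attention(seq):
    """Single-head dot-product attention; position t sees positions <= t."""
    d = len(seq[0])
    out = []
    for t, query in enumerate(seq):
        scores = [sum(q * k for q, k in zip(query, seq[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        weights = softmax(scores)
        out.append([sum(w * seq[s][j] for s, w in enumerate(weights))
                    for j in range(d)])
    return out


random.seed(0)
DIM = 4
# Hypothetical interaction history mixing two domains, in time order.
sequence = [("book_1", "books"), ("movie_7", "movies"), ("book_3", "books")]

id_table = {item: [random.gauss(0, 1) for _ in range(DIM)]
            for item, _ in sequence}
# Stand-in for frozen CLIP image embeddings (never updated during training).
image_table = {item: [random.gauss(0, 1) for _ in range(DIM)]
               for item, _ in sequence}

fused = {item: fuse(id_table[item], image_table[item]) for item, _ in sequence}

# Cross-domain view: attend over the full mixed sequence.
cross_domain_states = causal_self_attention([fused[i] for i, _ in sequence])
# Single-domain view: attend over the book interactions only.
book_states = causal_self_attention(
    [fused[i] for i, dom in sequence if dom == "books"])
```

In a trained model the two views would feed a shared prediction head so that single-domain and cross-domain preferences are learned jointly; here they are simply computed side by side to show the data flow.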

Problem

Research questions and friction points this paper is trying to address.

Predict future user interactions across domains.

Capture cross-domain user preferences effectively.

Enhance recommendation with item image information.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates a frozen CLIP model to generate image embeddings.

Uses multiple attention layers to capture cross-domain interests.

Enhances item embeddings with visual data.

🔎 Similar Papers

Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation