A Comprehensive Survey on Self-Supervised Learning for Recommendation

📅 2024-04-04
🏛️ ACM Computing Surveys
📈 Citations: 21
Influential: 1
🤖 AI Summary
To address the limited representation learning capability of supervised methods under data sparsity in recommender systems, this work presents a comprehensive survey of self-supervised learning (SSL) for recommendation, synthesizing over 170 papers across nine application scenarios and three major paradigms: contrastive learning, generative learning, and adversarial learning. The survey proposes a unified taxonomy of SSL-based recommendation covering all scenarios and paradigms, and details how each paradigm generates supervision signals from unlabeled behavioral data to improve recommendation under sparsity, including cold-start and long-tail settings. The authors also establish and actively maintain an open-source paper repository to foster standardization and reproducibility in the field.

📝 Abstract
Recommender systems play a crucial role in tackling the challenge of information overload by delivering personalized recommendations based on individual user preferences. Deep learning techniques, such as RNNs, GNNs, and Transformer architectures, have significantly propelled the advancement of recommender systems by enhancing their comprehension of user behaviors and preferences. However, supervised learning methods encounter challenges in real-life scenarios due to data sparsity, resulting in limitations in their ability to learn representations effectively. To address this, self-supervised learning (SSL) techniques have emerged as a solution, leveraging inherent data structures to generate supervision signals without relying solely on labeled data. By leveraging unlabeled data and extracting meaningful representations, recommender systems utilizing SSL can make accurate predictions and recommendations even when confronted with data sparsity. In this article, we provide a comprehensive review of self-supervised learning frameworks designed for recommender systems, encompassing a thorough analysis of over 170 papers. We conduct an exploration of nine distinct scenarios, enabling a comprehensive understanding of SSL-enhanced recommenders in different contexts. For each domain, we elaborate on different self-supervised learning paradigms, namely contrastive learning, generative learning, and adversarial learning, so as to present technical details of how SSL enhances recommender systems in various contexts. We consistently maintain the related open-source materials at https://github.com/HKUDS/Awesome-SSLRec-Papers.
Problem

Research questions and friction points this paper is trying to address.

Addresses data sparsity in recommender systems, which limits the representation learning ability of purely supervised methods
Surveys the three main SSL paradigms for recommendation: contrastive, generative, and adversarial learning
Reviews over 170 papers across nine application scenarios to give a comprehensive view of SSL-enhanced recommenders
Innovation

Methods, ideas, or system contributions that make the work stand out.

SSL leverages inherent data structures to generate supervision signals, reducing reliance on labeled interactions
Three paradigms are distinguished: contrastive learning (view alignment), generative learning (reconstruction), and adversarial learning (discriminator-driven training)
By extracting meaningful representations from unlabeled data, SSL-based recommenders remain accurate under data sparsity
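To make the contrastive paradigm concrete, below is a minimal sketch of an InfoNCE-style objective of the kind commonly used in SSL-based recommenders: two augmented views of the same user/item embeddings form positive pairs, while other in-batch embeddings act as negatives. All function names, shapes, and the noise-based augmentation here are illustrative assumptions, not the survey's own notation or any specific model from it.

```python
import numpy as np

def info_nce_loss(view_a, view_b, temperature=0.2):
    """Average InfoNCE loss over a batch of paired embedding views.

    view_a, view_b: (batch, dim) arrays; row i of each is a different
    augmented view of the same user/item.
    """
    # L2-normalize so dot products become cosine similarities.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (batch, batch) similarity matrix
    # Diagonal entries are the positive pairs; a softmax cross-entropy
    # pulls positives together and pushes in-batch negatives apart.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))             # base embeddings, e.g. from a GNN encoder
noise = 0.05 * rng.normal(size=emb.shape)  # light perturbation as a stand-in for graph/feature augmentation
loss = info_nce_loss(emb, emb + noise)
```

Because the two views of each embedding are nearly identical while in-batch negatives are random, the loss lands well below the uniform baseline of log(batch); in a real recommender the augmentations would instead come from edge dropout, feature masking, or similar structural perturbations surveyed in the paper.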