Diverse Negative Sampling for Implicit Collaborative Filtering

📅 2025-08-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
In implicit collaborative filtering, existing negative sampling strategies tend to oversample from dense regions of the user-item interaction space, yielding homogeneous negative instances that limit model expressiveness and generalization. To address this, we propose DivNS, a negative sampling framework that explicitly accounts for diversity in negative training data. DivNS comprises three stages: user-specific cache management, diversity-aware sampling, and synthetic negative instance generation, which together broaden the coverage and sharpen the discriminative power of negative samples. It requires no additional annotations and integrates seamlessly with mainstream implicit CF models. Extensive experiments on four public benchmarks show that DivNS consistently improves Recall@K and NDCG@K by an average of 3.2% over strong baselines, including BPR, IPS, and SLIME, while keeping computational overhead low.

📝 Abstract
Implicit collaborative filtering recommenders are usually trained to learn users' positive preferences. Negative sampling, which selects informative negative items to form negative training data, plays a crucial role in this process. Since items are often clustered in the latent space, existing negative sampling strategies normally oversample negative items from the dense regions. This leads to homogeneous negative data and limited model expressiveness. In this paper, we propose Diverse Negative Sampling (DivNS), a novel approach that explicitly accounts for diversity in negative training data during the negative sampling process. DivNS first finds hard negative items with large preference scores and constructs user-specific caches that store unused but highly informative negative samples. Then, its diversity-augmented sampler selects a diverse subset of negative items from the cache while ensuring dissimilarity from the user's hard negatives. Finally, a synthetic negatives generator combines the selected diverse negatives with hard negatives to form more effective training data. The resulting synthetic negatives are both informative and diverse, enabling recommenders to learn a broader item space and improve their generalisability. Extensive experiments on four public datasets demonstrate the effectiveness of DivNS in improving recommendation quality while maintaining computational efficiency.
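The three stages in the abstract can be sketched with toy embeddings. This is a minimal illustration only: the dot-product scorer, cache size, greedy max-min selection, and interpolation weights are all assumptions standing in for the paper's actual components.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, dim = 100, 8
item_embs = rng.normal(size=(n_items, dim))
user_vec = rng.normal(size=dim)
positives = {3, 17, 42}  # the user's observed interactions

# Stage 1: user-specific cache of hard negatives, i.e. the non-interacted
# items with the highest preference scores (dot product, an assumption).
scores = item_embs @ user_vec
candidates = np.array([i for i in range(n_items) if i not in positives])
cache = candidates[np.argsort(-scores[candidates])[:20]]

# Stage 2: diversity-aware sampling. A greedy max-min heuristic is used
# here as a stand-in: repeatedly pick the cached item farthest (in
# embedding space) from everything already selected.
def select_diverse(cache, item_embs, k):
    chosen = [cache[0]]
    rest = list(cache[1:])
    while len(chosen) < k:
        dists = [min(np.linalg.norm(item_embs[c] - item_embs[s])
                     for s in chosen) for c in rest]
        chosen.append(rest.pop(int(np.argmax(dists))))
    return np.array(chosen)

diverse = select_diverse(cache, item_embs, k=5)

# Stage 3: synthetic negatives. Interpolating hard and diverse item
# embeddings is one plausible generator (an assumption, not the paper's
# stated formula).
hard = cache[:5]
alpha = rng.uniform(0.3, 0.7, size=(5, 1))
synthetic = alpha * item_embs[hard] + (1 - alpha) * item_embs[diverse]
print(synthetic.shape)  # (5, 8): one synthetic embedding per hard negative
```

Because the cache excludes observed positives and the greedy step never reselects an item, the resulting negatives are guaranteed to be unseen and mutually distinct.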
Problem

Research questions and friction points this paper is trying to address.

Addresses homogeneous negative data in implicit collaborative filtering
Improves model expressiveness through diverse negative sampling
Enhances recommendation quality by broadening learned item space
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diverse negative sampling from user-specific caches
Combining hard negatives with diverse synthetic negatives
Ensuring dissimilarity from user's hard negative items
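Synthetic negatives produced this way would then feed a standard pairwise objective. Below is a minimal BPR-style loss over stand-in negative embeddings; pairing DivNS with BPR is an assumption based on the baselines named in the summary, and `bpr_loss` and its inputs are illustrative.

```python
import numpy as np

def bpr_loss(user_vec, pos_emb, neg_embs):
    """Pairwise BPR loss: the positive item should outscore each negative."""
    pos_score = pos_emb @ user_vec
    neg_scores = neg_embs @ user_vec
    # -log(sigmoid(pos - neg)), averaged over the (synthetic) negatives
    return float(-np.mean(np.log(1.0 / (1.0 + np.exp(-(pos_score - neg_scores))))))

rng = np.random.default_rng(1)
dim = 8
user_vec = rng.normal(size=dim)
pos_emb = user_vec * 0.9                # well-aligned item: high score
neg_embs = rng.normal(size=(5, dim))    # stand-in synthetic negatives
loss = bpr_loss(user_vec, pos_emb, neg_embs)
print(loss > 0.0)  # True: the loss is strictly positive
```

Since synthetic negatives are embeddings rather than item IDs, they plug into the loss directly at the embedding level, which is what lets DivNS integrate with existing implicit CF models without extra annotations.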