DiffCold: A Diffusion-based Generative Model for Cold-Start Item Recommendation

📅 2026-06-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses two linked problems in recommender systems: cold-start items perform poorly because they have no interaction history, and improving cold-item performance tends to degrade performance on warm (popular) items, a trade-off the authors call the “seesaw dilemma”. The authors propose DiffCold, a diffusion-based generative framework for cold-start recommendation. DiffCold uses conditional diffusion to reconstruct warm-item embeddings from item content features, and pairs it with a retrieval-enhanced aggregator and a simulation-based representation alignment module so that cold and warm items share one embedding space, mitigating the manifold inconsistency between them. Experiments on three benchmark datasets show DiffCold outperforming state-of-the-art methods on both cold-start and general recommendation metrics, which the authors present as resolving the seesaw dilemma.
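The alignment idea can be illustrated with a small sketch. The paper states only that generated and real embeddings are aligned via contrastive learning, so the InfoNCE form below, the `temperature` value, and the function name `info_nce` are assumptions chosen for illustration, and the "simulation-based" part of the module is not modeled here.

```python
import numpy as np

def info_nce(generated, real, temperature=0.1):
    """InfoNCE-style loss (assumed form): row i of `generated` should match row i of `real`.

    Both inputs have shape (n_items, dim). Other rows in the batch act as negatives.
    """
    # L2-normalise so the dot product is cosine similarity.
    g = generated / np.linalg.norm(generated, axis=1, keepdims=True)
    r = real / np.linalg.norm(real, axis=1, keepdims=True)
    logits = (g @ r.T) / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return -np.mean(np.diag(log_prob))
```

A lower value means each generated embedding sits closer to its own real embedding than to those of other items in the batch.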
📝 Abstract
Cold-start item recommendation remains a persistent challenge in real-world systems due to the absence of interaction histories. While prior models attempt to bridge this gap using item content features, they universally suffer from the seesaw dilemma: enhancing performance for cold items inevitably degrades performance for warm items, and vice versa. We identify that this dilemma stems from a fundamental distributional disparity: warm item embeddings occupy a complex “behavioral manifold” shaped by rich interaction signals, whereas cold item embeddings are constrained to a “semantic manifold” derived solely from auxiliary content. Existing methods often force a rigid mapping between these inconsistent spaces, causing the model to sacrifice the precision of warm representations to accommodate cold ones. To address this, we propose DiffCold, a diffusion-based generative model that unifies warm and cold representations. Unlike GANs or VAEs, DiffCold leverages conditional diffusion to reconstruct warm item embeddings from content, preserving the underlying manifold structure without degradation. We further tailor this paradigm with two specific designs: a Retrieval-enhanced Aggregator that initializes generation using semantically similar warm items to bypass inefficient noise, and a Simulation-based Representation Alignment module that enforces distribution consistency between generated and real embeddings via contrastive learning. Experiments on three benchmarks confirm that DiffCold resolves the seesaw dilemma, consistently outperforming state-of-the-art methods across all metrics.
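The retrieval-enhanced initialization described in the abstract can be sketched roughly as follows. This is an illustration under assumptions, not the paper's implementation: the cosine top-k retrieval, the mean aggregation, and the names `retrieval_init` and `forward_noise` are all hypothetical, and the learned conditional denoiser that would refine the result is omitted. The noising step uses the standard DDPM forward formula.

```python
import numpy as np

def retrieval_init(cold_content, warm_content, warm_emb, k=3):
    """Start generation from the mean embedding of the k most similar warm items.

    cold_content: (c,) content features of one cold item.
    warm_content: (n, c) content features of warm items.
    warm_emb:     (n, d) trained embeddings of the same warm items.
    Mean aggregation is an assumption; the paper's aggregator may differ.
    """
    q = cold_content / np.linalg.norm(cold_content)
    keys = warm_content / np.linalg.norm(warm_content, axis=1, keepdims=True)
    sims = keys @ q                      # cosine similarity to every warm item
    top = np.argsort(-sims)[:k]          # indices of the k nearest warm items
    return warm_emb[top].mean(axis=0), top

def forward_noise(x0, alpha_bar_t, rng):
    """Standard DDPM forward step: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
```

In this sketch, a cold item would be initialized with `retrieval_init`, lightly noised with `forward_noise` at an intermediate step, and then passed to a content-conditioned denoiser, instead of starting the reverse process from pure Gaussian noise.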
Problem

Research questions and friction points this paper is trying to address.

cold-start recommendation
seesaw dilemma
distributional disparity
item embedding
behavioral manifold
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion-based generative model
cold-start recommendation
manifold alignment
retrieval-enhanced aggregator
representation consistency
🔎 Similar Papers