🤖 AI Summary
This work addresses how LLM-based recommender systems can balance external collaborative signals against the model's internal semantic knowledge: supplying collaborative embeddings does not guarantee that the LLM uses them properly during inference. A diagnostic attention analysis shows that the use of collaborative embeddings is depth-dependent and alignment-sensitive. In response, the authors propose SAILRec, which combines dual-side semantic alignment (item-side embeddings are aligned with item-text semantics, and user-side embeddings with codebook-based semantic profiles) with hierarchical attention steering that suppresses premature collaborative interference in shallow layers and strengthens collaborative evidence in deeper decision layers. Experiments on MovieLens-1M and Amazon-Book show that SAILRec consistently outperforms representative baselines, and ablation and masking analyses validate its key designs.
📝 Abstract
Recent LLM-based recommenders enhance language models with collaborative embeddings from user-item interactions, but making such embeddings available does not ensure their proper use during inference. Through a diagnostic attention analysis, we find that the utilization of collaborative embeddings is depth-dependent and alignment-sensitive, suggesting that LLMs need to balance their internal semantic knowledge with external collaborative knowledge. To address this issue, we propose SAILRec, an LLM-based recommender that improves this balance through dual-side semantic alignment and hierarchical attention steering. The former aligns item-side embeddings with item-text semantics and user-side embeddings with codebook-based semantic profiles, while the latter suppresses premature shallow-layer collaborative interference and strengthens collaborative evidence in deeper decision layers. Experiments on MovieLens-1M and Amazon-Book show that SAILRec consistently outperforms representative baselines, with ablation and masking analyses validating its key designs.
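The idea of hierarchical attention steering can be illustrated with a toy sketch. This is not SAILRec's implementation: the abstract does not say how the steering is realized, so the additive logit bias, the `split` depth threshold, the `bias` strength, and every function name below are assumptions chosen only to show the shallow-suppress, deep-strengthen pattern.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def steer_attention(logits, is_collab, layer, num_layers, split=0.5, bias=2.0):
    """Return attention weights after a depth-dependent bias on collaborative tokens.

    Hypothetical parameters (not from the paper):
      split: fraction of depth treated as "shallow"
      bias:  magnitude added to or subtracted from collaborative-token logits
    """
    # Shallow layers push attention away from collaborative tokens;
    # deeper layers pull attention toward them.
    sign = -1.0 if layer < split * num_layers else 1.0
    steered = [x + sign * bias if c else x for x, c in zip(logits, is_collab)]
    return softmax(steered)


def collab_mass(weights, is_collab):
    """Total attention weight placed on collaborative-embedding positions."""
    return sum(w for w, c in zip(weights, is_collab) if c)


# Four key positions: two text tokens, then two collaborative-embedding tokens.
logits = [0.0, 0.0, 0.0, 0.0]
is_collab = [False, False, True, True]

shallow = steer_attention(logits, is_collab, layer=2, num_layers=32)
deep = steer_attention(logits, is_collab, layer=30, num_layers=32)

print(round(collab_mass(shallow, is_collab), 3))
print(round(collab_mass(deep, is_collab), 3))
```

With uniform logits, unsteered attention would place half of its mass on the collaborative tokens; the sketch moves that share below one half in the shallow layer and above one half in the deep layer. A learned or data-dependent steering signal could replace the fixed bias used here.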