🤖 AI Summary
To address three key challenges in multimodal recommendation—(1) limited representation diversity, (2) insufficient modeling of nonlinear cross-modal interactions, and (3) over-smoothing in graph convolutional networks (GCNs)—this paper proposes HPMRec, a Hypercomplex Prompt-aware Multimodal Recommendation framework. Specifically, HPMRec (i) employs multi-component hypercomplex embeddings to enrich the diversity of multimodal feature representations; (ii) explicitly captures deep nonlinear cross-modal interactions via hypercomplex multiplication; and (iii) introduces a prompt-aware compensation mechanism that alleviates component misalignment and the loss of modality-specific information while mitigating GCN over-smoothing. It further integrates self-supervised learning tasks for end-to-end optimization. Extensive experiments on four public benchmarks show that HPMRec consistently outperforms state-of-the-art methods, validating its effectiveness in multimodal representation alignment and generalization.
📝 Abstract
Modern recommender systems face critical challenges in handling information overload while addressing the inherent limitations of multimodal representation learning. Existing methods suffer from three fundamental limitations: (1) a restricted ability to capture rich multimodal features within a single representation, (2) linear modality fusion strategies that ignore deep nonlinear correlations between modalities, and (3) static optimization methods that fail to dynamically mitigate the over-smoothing problem in graph convolutional networks (GCNs). To overcome these limitations, we propose HPMRec, a novel Hypercomplex Prompt-aware Multimodal Recommendation framework, which represents multimodal features as multi-component hypercomplex embeddings to enhance representation diversity. HPMRec adopts hypercomplex multiplication to naturally establish nonlinear cross-modality interactions that bridge semantic gaps, which is beneficial for exploring cross-modality features. HPMRec also introduces a prompt-aware compensation mechanism to mitigate the misalignment between components and the loss of modality-specific features; this mechanism fundamentally alleviates the over-smoothing problem. It further designs self-supervised learning tasks that enhance representation diversity and align different modalities. Extensive experiments on four public datasets show that HPMRec achieves state-of-the-art recommendation performance.
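To make the fusion idea concrete: quaternions are the 4-component case of hypercomplex numbers, and their Hamilton product is one instance of the kind of hypercomplex multiplication the abstract describes. The sketch below is purely illustrative (the function name and toy embeddings are not from the paper); it shows how every output component mixes all components of both operands, unlike element-wise linear fusion.

```python
# Illustrative sketch (not the paper's implementation): fusing a "visual" and
# a "textual" item embedding, each modeled as one quaternion (a, b, c, d),
# via the Hamilton product. Every output component depends on all four
# components of both inputs, giving a nonlinear cross-modal interaction.

def hamilton_product(q1, q2):
    """Multiply two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,  # real part
        a1*b2 + b1*a2 + c1*d2 - d1*c2,  # i component
        a1*c2 - b1*d2 + c1*a2 + d1*b2,  # j component
        a1*d2 + b1*c2 - c1*b2 + d1*a2,  # k component
    )

# Toy modality embeddings (hypothetical values for illustration):
visual = (0.5, 1.0, -0.5, 0.2)
textual = (1.0, 0.0, 0.3, -0.4)
fused = hamilton_product(visual, textual)
```

Note that the Hamilton product is non-commutative (`i*j = k` but `j*i = -k`), so the fused representation is sensitive to which modality plays which role — a property element-wise fusion cannot express.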