Hypercomplex Prompt-aware Multimodal Recommendation

📅 2025-08-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address three key challenges in multimodal recommendation, namely (1) limited representation diversity, (2) insufficient modeling of nonlinear cross-modal interactions, and (3) over-smoothing in graph convolutional networks (GCNs), this paper proposes HPMRec, a Hypercomplex Prompt-aware Multimodal Recommendation framework. Specifically, HPMRec (i) employs multi-component hypercomplex embeddings to enrich multimodal feature diversity; (ii) explicitly captures deep nonlinear cross-modal interactions via hypercomplex multiplication; and (iii) introduces a prompt-aware compensation mechanism to alleviate component misalignment and modality-specific information loss while mitigating GCN over-smoothing. It further integrates self-supervised learning for end-to-end optimization. Extensive experiments on four public benchmarks demonstrate that HPMRec consistently outperforms state-of-the-art methods, validating its effectiveness in multimodal representation alignment and generalization.

📝 Abstract
Modern recommender systems face critical challenges in handling information overload while addressing the inherent limitations of multimodal representation learning. Existing methods suffer from three fundamental limitations: (1) a restricted ability to capture rich multimodal features in a single representation, (2) linear modality-fusion strategies that ignore the deep nonlinear correlations between modalities, and (3) static optimization methods that fail to dynamically mitigate the over-smoothing problem in graph convolutional networks (GCNs). To overcome these limitations, we propose HPMRec, a novel Hypercomplex Prompt-aware Multimodal Recommendation framework, which utilizes multi-component hypercomplex embeddings to enhance the representational diversity of multimodal features. HPMRec adopts hypercomplex multiplication to naturally establish nonlinear cross-modality interactions that bridge semantic gaps, which is beneficial for exploring cross-modality features. HPMRec also introduces a prompt-aware compensation mechanism to mitigate the misalignment between components and the loss of modality-specific features; this mechanism fundamentally alleviates the over-smoothing problem. It further designs self-supervised learning tasks that enhance representation diversity and align the different modalities. Extensive experiments on four public datasets show that HPMRec achieves state-of-the-art recommendation performance.
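As a concrete illustration of the nonlinear cross-modality interaction the abstract describes: in the 4-component (quaternion) case, hypercomplex multiplication is the Hamilton product, in which every output component mixes all components of both inputs, unlike element-wise or linear fusion. The sketch below is illustrative only; the 4-component choice and the NumPy layout are assumptions, not details taken from the paper.

```python
import numpy as np

def hamilton_product(p, q):
    """Hamilton product of quaternion embeddings, shape (..., 4).

    Each output component combines all components of both inputs,
    so fusing two modality embeddings this way yields a nonlinear
    (bilinear, component-mixing) interaction rather than a
    component-wise one.
    """
    a, b, c, d = np.moveaxis(p, -1, 0)
    e, f, g, h = np.moveaxis(q, -1, 0)
    return np.stack([
        a*e - b*f - c*g - d*h,   # real part
        a*f + b*e + c*h - d*g,   # i component
        a*g - b*h + c*e + d*f,   # j component
        a*h + b*g - c*f + d*e,   # k component
    ], axis=-1)

# Hypothetical usage: fuse visual and textual quaternion embeddings
# for a batch of 8 items (one 4-component quaternion each).
visual = np.random.randn(8, 4)
textual = np.random.randn(8, 4)
fused = hamilton_product(visual, textual)   # shape (8, 4)
```

Note that the Hamilton product is non-commutative (e.g. i * j = k but j * i = -k), so the fusion order of the modalities matters, which is one reason hypercomplex fusion is more expressive than symmetric element-wise operations.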
Problem

Research questions and friction points this paper is trying to address.

Enhancing multimodal feature representation diversity
Establishing nonlinear cross-modality interaction correlations
Dynamically mitigating graph convolutional network over-smoothing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypercomplex embeddings for multimodal feature representation
Hypercomplex multiplication for nonlinear cross-modality interactions
Prompt-aware compensation mechanism to alleviate over-smoothing
Zheyu Chen
PhD, Beijing Institute of Technology
Recommendation System
Jinfeng Xu
The University of Hong Kong
Hewei Wang
Carnegie Mellon University / Apple
Machine Learning, Computer Vision, Vision-Language Models, Generative Models
Shuo Yang
The University of Hong Kong
Zitong Wan
University College Dublin
Haibo Hu
The Hong Kong Polytechnic University