PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval

📅 2025-03-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address semantic distortion, contextual fragmentation, and information redundancy in cross-modal hashing, this paper proposes PromptHash, an end-to-end affinity-prompted collaborative learning framework. It introduces: (1) a text affinity prompt learning mechanism that preserves contextual information while remaining parameter-efficient; (2) an adaptive gated selection fusion architecture that combines a State Space Model (SSM) with a Transformer for cross-modal feature integration; and (3) a prompt affinity alignment strategy that uses hierarchical contrastive learning to bridge modality heterogeneity. The method is evaluated on three benchmark multi-label datasets; on NUS-WIDE, it reports gains in mean Average Precision (mAP) of 18.22% for image-to-text retrieval and 18.65% for text-to-image retrieval over existing approaches.
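For readers unfamiliar with the metric, the following is a minimal sketch of how mAP is commonly computed for ranked retrieval results. It is not taken from the PromptHash code; the function names are illustrative, and the relevance rule (a retrieved item counts as relevant when it shares at least one label with the query, the usual convention on multi-label benchmarks) is an assumption rather than something this page states.

```python
def average_precision(relevance):
    """AP for one query.

    `relevance` is the ranked result list as 0/1 flags, best match first.
    Assumption: an item is flagged 1 when it shares a label with the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_relevance_lists):
    """Mean of the per-query AP values; 0.0 for an empty query set."""
    aps = [average_precision(r) for r in ranked_relevance_lists]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Two toy queries: relevant items at ranks 1 and 3, then at rank 2.
    print(mean_average_precision([[1, 0, 1], [0, 1]]))
```

A query whose relevant items appear early in the ranking scores close to 1, so a higher mAP indicates that the hash codes place semantically matching items nearer the top.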

📝 Abstract
Cross-modal hashing is a promising approach for efficient data retrieval and storage optimization. However, contemporary methods exhibit significant limitations in semantic preservation and contextual integrity and suffer from information redundancy, all of which constrain retrieval efficacy. We present PromptHash, an end-to-end framework that leverages affinity prompt-aware collaborative learning for adaptive cross-modal hashing, with the following technical contributions: (i) a text affinity prompt learning mechanism that preserves contextual information while maintaining parameter efficiency, (ii) an adaptive gated selection fusion architecture that combines a State Space Model with a Transformer network for precise cross-modal feature integration, and (iii) a prompt affinity alignment strategy that bridges modal heterogeneity through hierarchical contrastive learning. To the best of our knowledge, this study is the first to investigate affinity prompt awareness within collaborative cross-modal adaptive hash learning, establishing a paradigm for enhanced semantic consistency across modalities. In a comprehensive evaluation on three benchmark multi-label datasets, PromptHash demonstrates substantial performance improvements over existing approaches. Notably, on the NUS-WIDE dataset, our method achieves gains of 18.22% and 18.65% in image-to-text and text-to-image retrieval tasks, respectively. The code is publicly available at https://github.com/ShiShuMo/PromptHash.
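As background for the abstract, the sketch below illustrates the generic retrieval step that cross-modal hashing relies on: real-valued features from either modality are binarized into short codes, and a query is answered by ranking database codes by Hamming distance. This is a simplified illustration, not the PromptHash pipeline; the sign-based binarization, the function names, and the toy feature values are all assumptions made for the example.

```python
def binarize(features):
    """Map real-valued features to a {-1, +1} hash code by sign (0 maps to +1)."""
    return [1 if x >= 0 else -1 for x in features]


def hamming_distance(code_a, code_b):
    """Number of positions at which two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by distance to the query.

    Python's sort is stable, so ties keep their original index order.
    """
    distances = [hamming_distance(query_code, c) for c in database_codes]
    return sorted(range(len(database_codes)), key=lambda i: distances[i])


if __name__ == "__main__":
    # Hypothetical text-query features and image-side hash codes.
    text_query = binarize([0.7, -0.2, 0.1, -0.9])
    image_codes = [
        [1, -1, 1, -1],
        [-1, 1, -1, 1],
        [1, -1, -1, -1],
    ]
    print(rank_by_hamming(text_query, image_codes))  # prints [0, 2, 1]
```

Because comparison reduces to counting differing bits, storage and lookup are cheap; the difficulty the abstract points to is learning codes for which a small Hamming distance actually reflects semantic similarity across modalities.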
Problem

Research questions and friction points this paper is trying to address.

Enhances semantic preservation in cross-modal hashing
Improves contextual integrity and reduces information redundancy
Develops adaptive hashing retrieval with affinity-prompted collaborative learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Text affinity prompt learning mechanism
Adaptive gated selection fusion architecture
Prompt affinity alignment strategy
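To make two of the listed ideas more concrete, here is a small sketch of the general techniques they build on: an elementwise sigmoid gate that blends two feature branches, and a standard InfoNCE contrastive loss over paired image and text features. Neither function reproduces the paper's architecture or its hierarchical alignment objective. The gate logits are passed in directly (in a trained model they would come from a learned projection), the temperature value is arbitrary, and all names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(branch_a, branch_b, gate_logits):
    """Blend two feature vectors elementwise: g * a + (1 - g) * b.

    Assumption: `gate_logits` stands in for the output of a learned gate.
    """
    if not (len(branch_a) == len(branch_b) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for a, b, z in zip(branch_a, branch_b, gate_logits):
        g = sigmoid(z)
        fused.append(g * a + (1.0 - g) * b)
    return fused


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm else list(v)


def info_nce(image_feats, text_feats, temperature=0.1):
    """Image-to-text InfoNCE loss averaged over the batch.

    Pair i (image i, text i) is the positive; every other text in the
    batch serves as a negative for image i.
    """
    images = [l2_normalize(v) for v in image_feats]
    texts = [l2_normalize(v) for v in text_feats]
    total = 0.0
    for i, image in enumerate(images):
        logits = [dot(image, t) / temperature for t in texts]
        # Log-sum-exp with the max subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(images)


if __name__ == "__main__":
    print(gated_fusion([2.0, 4.0], [0.0, 0.0], [0.0, 0.0]))  # prints [1.0, 2.0]
    aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
    swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
    print(aligned < swapped)  # prints True
```

The gate lets the model decide per feature how much to trust each branch, while the contrastive loss is small only when each image is closer to its own text than to the other texts in the batch.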
Qiang Zou
Assistant Professor, State Key Lab of CAD&CG, ZJU
Geometric Modeling, Physical Modeling, CAD/CAM
Shuli Cheng
School of Computer Science and Technology, Xinjiang University, China
Jiayi Chen
School of Computer Science and Technology, Xinjiang University, China