AI Summary
This work addresses multimodal federated graph learning in which some clients lack an entire modality (e.g., they hold only images or only text), which hinders cross-modal representation learning. The authors propose PRISM, a topology-aware framework for federated cross-modal imputation. PRISM uses a structural meta-prompting mechanism to retrieve the semantics of the missing modality from the federation, rather than reconstructing them from local observations alone, and incorporates them into local message passing on the graph under topology-aware control. The design aims to limit the amplification of semantic bias that local modality incompleteness would otherwise introduce during propagation. Experiments on six multimodal graph datasets report an average gain of 4.48% over state-of-the-art baselines, with particularly notable improvements for clients that are missing a modality.
Abstract
Multimodal federated graph learning (MM-FGL) aims to collaboratively learn from decentralized graphs with text and images. However, real-world clients may not share a common modality basis: a visual-search client may contain image--interaction graphs but no seller descriptions, while a catalog client may provide text but no product images. We refer to this practical setting as client-level modality deficiency. Unlike random instance-wise missingness, a deficient client lacks the local semantic basis needed to reconstruct the absent modality. More importantly, in graph learning, incomplete representations initialize message passing, so imputation errors can be filtered, mixed, and amplified by the receiving topology. To address this gap, we propose \textbf{PRISM} (\textbf{P}roactive \textbf{R}etrieval and \textbf{I}mputation via \textbf{S}tructural \textbf{M}eta-prompting), a topology-aware federated cross-modal imputation framework. Rather than reconstructing the missing modality solely from local observations, PRISM recovers missing-modality semantics from the federation and introduces them into local graph propagation under topology-aware control. Experiments on six multimodal graph datasets across graph-centric and modality-centric tasks show that PRISM consistently improves modality-deficient clients, outperforming state-of-the-art baselines by \textbf{4.48}\% on average.