Multimodal Graph Negative Learning

📅 2026-06-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses node-level semantic imbalance in multimodal attributed graphs, where modalities vary in informativeness and reliability across nodes, so that enforcing strict alignment propagates bias and suppresses useful semantics. To mitigate this, the authors propose GraphMNL, a framework that abandons conventional inter-branch mimicry and instead introduces a graph-aware negative learning mechanism. This approach guides weaker branches to identify non-target classes rather than replicate dominant ones, while integrating target-preserving strategies, graph-aware reliability arbitration, and an instability-aware transfer gating module. Together, these components are designed to prevent bias propagation and preserve the intrinsic discriminative semantics of each modality. In the reported experiments, GraphMNL attains 72.47% accuracy on the Grocery dataset and an F1 score of 76.60 on Reddit-M, the best results among the methods compared.

📝 Abstract

Multimodal attributed graphs (MAGs) integrate graph topology with heterogeneous modality attributes, such as text and images, thereby enabling richer modeling of complex relational systems. However, such expressiveness also makes learning on MAGs depend on multiple semantic sources, including structural topology and textual and visual attributes, each of which can be regarded as a branch for node representation. Node-level branch semantic imbalance arises when these branches differ across nodes in semantic informativeness and reliability: a branch that provides discriminative semantics for one node may mislead another due to bias in modality quality or structural context. Existing methods often mitigate such heterogeneity through cross-branch agreement or alignment, implicitly treating the dominant prediction as reliable supervision. When the dominant branch is biased, forced imitation may propagate its bias to other branches and suppress original semantics that are useful for classification. We propose GraphMNL, a graph-aware multimodal negative learning framework that addresses this issue by using Negative Learning as cross-branch guidance. Instead of forcing inferior branches to imitate a teacher prediction, the model teaches them which classes a node is unlikely to belong to. GraphMNL builds a branch library, identifies dominant and inferior branches via graph-aware reliability arbitration, gates unstable transfer, and applies target-preserving negative learning over non-target classes. This design decouples target supervision from branch guidance so that supervised losses learn the correct class, while Negative Learning suppresses unlikely alternatives when branch agreement is unreliable. In a comprehensive experimental evaluation, GraphMNL achieves the best performance, with 72.47% accuracy on the Grocery dataset and an F1 score of 76.60 on the Reddit-M dataset.
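To make the idea of target-preserving negative learning concrete, here is a minimal sketch for a single node. It is not the authors' implementation. It uses the standard negative learning loss, -log(1 - p_c), applied to an inferior branch for classes c that a dominant branch considers unlikely. How GraphMNL actually selects negative classes, arbitrates reliability, or computes its gate is not given in the abstract, so `negative_learning_loss`, the "k least likely non-target classes" rule, and the scalar `gate` are all hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def negative_learning_loss(inferior_logits, dominant_logits, target,
                           k=1, gate=1.0, eps=1e-12):
    """Negative learning guidance for one node (illustrative sketch).

    Assumptions, not taken from the paper:
      * negatives are the k non-target classes to which the dominant
        branch assigns the lowest probability;
      * `gate` is a scalar in [0, 1] that switches the transfer off
        when the dominant branch is judged unstable.
    """
    p_inf = softmax(inferior_logits)
    p_dom = softmax(dominant_logits)

    # Target-preserving: the labelled class is never used as a negative,
    # so this term cannot push probability away from the correct class.
    candidates = sorted(
        (c for c in range(len(p_dom)) if c != target),
        key=lambda c: p_dom[c],
    )
    negatives = candidates[:k]

    # Standard negative learning term: penalise probability mass that the
    # inferior branch places on classes deemed unlikely.
    loss = -sum(math.log(1.0 - p_inf[c] + eps) for c in negatives)
    loss /= max(len(negatives), 1)
    return gate * loss, negatives


if __name__ == "__main__":
    loss, negatives = negative_learning_loss(
        inferior_logits=[2.0, 1.0, 0.5],
        dominant_logits=[3.0, 0.1, -1.0],
        target=0,
        k=1,
    )
    print("negative classes:", negatives, "loss:", loss)
```

In this sketch the supervised cross-entropy on the target class would be computed separately and added to the term above, which mirrors the decoupling of target supervision from branch guidance described in the abstract. Setting `gate=0.0` disables the guidance for a node without affecting the supervised loss.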

Problem

Research questions and friction points this paper is trying to address.

Multimodal attributed graphs

Node-level semantic imbalance

Cross-branch bias propagation

Modality reliability

Negative learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Graph Learning

Negative Learning

Semantic Imbalance