🤖 AI Summary
Graph Neural Networks (GNNs) exhibit limited generalization across domains and tasks, suffering from negative transfer, poor scalability, and high adaptation costs. To address these challenges, we propose GMoPE, a prompt-driven Mixture-of-Experts framework for graph foundation models. Our method introduces a structure-aware dynamic routing mechanism and enforces expert diversity via a soft orthogonality constraint on prompt vectors to prevent expert collapse. Crucially, it adopts a lightweight fine-tuning paradigm that optimizes only task-specific prompt vectors, enabling efficient cross-task and cross-domain adaptation. Extensive experiments across diverse pretraining settings and downstream tasks demonstrate that GMoPE matches the performance of full-parameter fine-tuning while reducing trainable parameters by over 99%. It significantly outperforms existing GNN generalization approaches in both accuracy and efficiency, establishing a new state of the art for adaptable graph representation learning.
📝 Abstract
Graph Neural Networks (GNNs) have demonstrated impressive performance on task-specific benchmarks, yet their ability to generalize across diverse domains and tasks remains limited. Existing approaches often struggle with negative transfer, scalability issues, and high adaptation costs. To address these challenges, we propose GMoPE (Graph Mixture of Prompt-Experts), a novel framework that seamlessly integrates the Mixture-of-Experts (MoE) architecture with prompt-based learning for graphs. GMoPE leverages expert-specific prompt vectors and structure-aware MoE routing to enable each expert to specialize in distinct subdomains and dynamically contribute to predictions. To promote diversity and prevent expert collapse, we introduce a soft orthogonality constraint across prompt vectors, encouraging expert specialization and facilitating more balanced expert utilization. Additionally, we adopt a prompt-only fine-tuning strategy that significantly reduces the time and memory overhead of transfer. We validate GMoPE through extensive experiments under various pretraining strategies and multiple downstream tasks. Results show that GMoPE consistently outperforms state-of-the-art baselines and achieves performance comparable to full-parameter fine-tuning, while requiring only a fraction of the adaptation overhead. Our work provides a principled and scalable framework for advancing generalizable and efficient graph foundation models.
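To make the two core mechanisms concrete, below is a minimal NumPy sketch of (a) a soft orthogonality penalty over expert-specific prompt vectors and (b) prompt-similarity-based soft routing. This is an illustrative reconstruction from the abstract, not the paper's actual implementation: the function names, the cosine-similarity form of the penalty, and the dot-product gating are our assumptions; the paper's structure-aware router would condition on graph structure rather than a single feature vector.

```python
import numpy as np

def soft_orthogonality_penalty(P):
    """Soft orthogonality constraint across prompt vectors (assumed form).

    P: (num_experts, d) matrix of expert-specific prompt vectors.
    Rows are L2-normalized; the penalty is the squared Frobenius norm
    of the off-diagonal cosine-similarity entries, so it is 0 when
    prompts are mutually orthogonal and grows as experts collapse
    onto similar prompts.
    """
    Pn = P / np.linalg.norm(P, axis=1, keepdims=True)
    G = Pn @ Pn.T                        # pairwise cosine similarities
    off_diag = G - np.diag(np.diag(G))   # zero out self-similarity
    return float(np.sum(off_diag ** 2))

def route(x, P, temperature=1.0):
    """Soft MoE routing (assumed form): score each expert by the
    similarity between its prompt vector and the node representation x,
    then softmax the scores into mixture weights over experts."""
    scores = P @ x / temperature
    e = np.exp(scores - scores.max())    # numerically stable softmax
    return e / e.sum()
```

Under this sketch, only `P` (and any task prompt) would receive gradients during transfer, which is what makes the prompt-only fine-tuning strategy cheap: the pretrained expert GNNs stay frozen while the penalty keeps their prompts, and hence their routing regions, distinct.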