🤖 AI Summary
Existing graph foundation models emphasize language-driven semantic unification while neglecting structural heterogeneity across domains. Method: We propose BooG, a framework that introduces virtual super nodes (structured representation units constructed from anchor nodes and class information) and virtual edges for efficient neighborhood aggregation. BooG further incorporates a structure-aware graph neural network and a contrastive pretraining objective to explicitly model cross-domain structural alignment. Contribution/Results: Evaluated on diverse graph datasets and downstream tasks, BooG consistently outperforms state-of-the-art graph foundation models. The results indicate that structurally unified representations improve both the expressive power and the cross-domain generalization of graph representations, supporting the role of structural coherence in foundation modeling for graphs.
📝 Abstract
Graph foundation models have recently attracted significant attention due to their strong generalizability. Although existing methods resort to language models to learn unified semantic representations across domains, they disregard the unique structural characteristics of graphs from different domains. To address this problem, we boost graph foundation models from a structural perspective and propose BooG. The model constructs virtual super nodes to unify the structural characteristics of graph data from different domains. Specifically, each super node fuses the information of an anchor node and class labels, where the anchor node captures the information of a node or a graph instance to be classified. Instead of using the raw graph structure, we connect each super node to all nodes within its neighborhood by virtual edges. This new structure allows for effective information aggregation while unifying cross-domain structural characteristics. Additionally, we propose a novel pre-training objective based on contrastive learning, which learns more expressive representations for graph data and generalizes effectively to different domains and downstream tasks. Experimental results on various datasets and tasks demonstrate the superior performance of BooG. We provide our code and data here: https://anonymous.4open.science/r/BooG-EE42/.
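The super-node construction described in the abstract can be illustrated with a small sketch. It is not BooG's implementation: the function names, the k-hop definition of "neighborhood", the exclusion of the anchor from its own neighborhood, and the element-wise mean used to fuse anchor and label vectors are all assumptions made for illustration only.

```python
from collections import deque


def k_hop_neighborhood(adj, anchor, k):
    """Return the set of nodes within k hops of `anchor` (anchor excluded).

    `adj` maps each node id to an iterable of neighbor ids.
    Assumption: "neighborhood" means the k-hop neighborhood.
    """
    seen = {anchor}
    frontier = deque([(anchor, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    seen.discard(anchor)
    return seen


def build_super_node(adj, features, anchor, label_vec, k=2):
    """Create one hypothetical super node for `anchor`.

    Returns (super_id, fused_vector, virtual_edges).
    Assumption: anchor and label information are fused by an
    element-wise mean; the paper may use a different fusion.
    """
    anchor_vec = features[anchor]
    fused = [(a + c) / 2 for a, c in zip(anchor_vec, label_vec)]
    super_id = ("super", anchor)
    # Virtual edges replace the raw structure: the super node is linked
    # directly to every node in the anchor's k-hop neighborhood.
    neighbors = sorted(k_hop_neighborhood(adj, anchor, k))
    virtual_edges = [(super_id, n) for n in neighbors]
    return super_id, fused, virtual_edges


if __name__ == "__main__":
    # Toy path-like graph: 2 - 0 - 1 - 3 - 4
    adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1, 4], 4: [3]}
    features = {n: [1.0, 0.0] for n in adj}
    super_id, fused, edges = build_super_node(adj, features, 0, [0.0, 1.0], k=2)
    print(super_id, fused, edges)
```

In this toy graph, node 4 is three hops from the anchor, so it receives no virtual edge when `k=2`. The point of the sketch is that every anchor, whatever its source domain, ends up with the same star-shaped structure around its super node.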