DAM-GT: Dual Positional Encoding-Based Attention Masking Graph Transformer for Node Classification

📅 2025-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing neighborhood-aware graph Transformers face two key challenges in node classification: insufficient modeling of attribute correlations within neighborhoods, and attention distortion caused by distant-hop neighbors interfering with target-node focus. To address these, we propose dual positional encoding—integrating topological distance and attribute-clustering similarity—to explicitly capture structure-attribute co-dependencies within neighborhoods; and a target-guided attention mask that suppresses high-hop neighbors and enforces fine-grained interaction exclusively between the target node and its immediate (1-hop) neighbors. Our method is fully compatible with standard Transformer architectures and requires no modification to the model backbone. Extensive evaluation on multiple homogeneous graph benchmarks demonstrates consistent and significant improvements over state-of-the-art methods, achieving substantial gains in average classification accuracy. These results validate the effectiveness of jointly optimizing neighborhood semantic modeling and attention alignment.
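The target-guided masking idea described above can be sketched minimally: tokens beyond a hop threshold receive an additive mask of negative infinity so the softmax assigns them zero weight in the target node's attention row. This is an illustrative reconstruction, not the paper's exact formulation; the helper names (`target_guided_mask`, `masked_softmax`) and the toy hop layout are assumptions.

```python
import numpy as np

def target_guided_mask(hop_of_token, max_hop=1):
    """Additive attention mask for one target node's token sequence.

    hop_of_token: hop distance of each token (hop 0 = the target itself).
    Tokens beyond `max_hop` get -inf in the target's row so softmax
    zeroes them out. (Hypothetical helper; the paper's exact rule may differ.)
    """
    n = len(hop_of_token)
    mask = np.zeros((n, n))
    for j, hop in enumerate(hop_of_token):
        if hop > max_hop:
            mask[0, j] = -np.inf  # row 0 is the target node
    return mask

def masked_softmax(scores, mask):
    z = scores + mask
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Token sequence: target (hop 0), two 1-hop tokens, one 2-hop, one 3-hop.
hops = [0, 1, 1, 2, 3]
mask = target_guided_mask(hops)
scores = np.ones((5, 5))           # uniform raw attention scores
attn = masked_softmax(scores, mask)
print(attn[0])                     # target's weight on high-hop tokens is 0
```

With uniform scores, the target's attention is spread evenly over the surviving (hop ≤ 1) tokens and is exactly zero on the suppressed high-hop ones, which is the "attention distortion" fix the summary describes.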

📝 Abstract
Neighborhood-aware tokenized graph Transformers have recently shown great potential for node classification tasks. Despite their effectiveness, our in-depth analysis of neighborhood tokens reveals two critical limitations in the existing paradigm. First, current neighborhood token generation methods fail to adequately capture attribute correlations within a neighborhood. Second, the conventional self-attention mechanism suffers from attention diversion when processing neighborhood tokens, where high-hop neighborhoods receive disproportionate focus, severely disrupting information interactions between the target node and its neighborhood tokens. To address these challenges, we propose DAM-GT, Dual positional encoding-based Attention Masking graph Transformer. DAM-GT introduces a novel dual positional encoding scheme that incorporates attribute-aware encoding via an attribute clustering strategy, effectively preserving node correlations in both topological and attribute spaces. In addition, DAM-GT formulates a new attention mechanism with a simple yet effective masking strategy to guide interactions between target nodes and their neighborhood tokens, overcoming the issue of attention diversion. Extensive experiments on various graphs with different homophily levels as well as different scales demonstrate that DAM-GT consistently outperforms state-of-the-art methods in node classification tasks.
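The dual positional encoding combines a topological component (e.g. hop distance) with an attribute-aware component derived from clustering node features. A minimal sketch of that combination is below, assuming a plain k-means clustering and centroid-similarity encoding; the actual clustering strategy and encoding form used by DAM-GT are not specified here, and the toy features and hop distances are fabricated for illustration.

```python
import numpy as np

def kmeans(X, k=2, iters=10, seed=0):
    # Minimal k-means; stand-in for the paper's attribute clustering strategy.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=-1)
        labels = d.argmin(axis=1)
        centers = np.array([X[labels == c].mean(axis=0) if np.any(labels == c)
                            else centers[c] for c in range(k)])
    return labels, centers

rng = np.random.default_rng(0)
# Toy node features: two well-separated attribute clusters in 2-D.
X = np.vstack([rng.normal(0, 0.1, (5, 2)),
               rng.normal(3, 0.1, (5, 2))])

labels, centers = kmeans(X)

# Attribute-aware encoding: (negative) distance of each node to every centroid,
# concatenated with a topological encoding (here a hop-distance one-hot)
# to form a dual positional encoding. Hop values are illustrative.
sim = -np.linalg.norm(X[:, None] - centers[None], axis=-1)   # (N, k)
hops = np.array([0, 1, 1, 2, 2, 1, 1, 2, 3, 3])
hop_onehot = np.eye(hops.max() + 1)[hops]                    # (N, max_hop + 1)
dual_pe = np.concatenate([hop_onehot, sim], axis=1)          # (N, max_hop + 1 + k)
print(dual_pe.shape)
```

The point of the concatenation is that two nodes end up with similar encodings only when they are close in *both* topology and attribute space, which is the structure-attribute co-dependency the abstract argues existing neighborhood tokens miss.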
Problem

Research questions and friction points this paper is trying to address.

Inadequate capture of neighborhood attribute correlations
Attention diversion in self-attention mechanisms
Disrupted target-neighborhood information interactions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual positional encoding combining topological and attribute-clustering information
Attention masking to prevent diversion
Neighborhood-aware graph Transformer model
Chenyang Li
School of Computer Science and Technology, Huazhong University of Science and Technology; Hopcroft Center on Computing Science, Huazhong University of Science and Technology
Jinsong Chen
Central China Normal University
Graph Representation Learning · Graph Data Mining · AI for Education
J. Hopcroft
Hopcroft Center on Computing Science, Huazhong University of Science and Technology; Department of Computer Science, Cornell University
Kun He
School of Computer Science and Technology, Huazhong University of Science and Technology; Hopcroft Center on Computing Science, Huazhong University of Science and Technology