Hierarchical Vector Quantized Graph Autoencoder with Annealing-Based Code Selection

📅 2025-04-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing graph self-supervised learning methods overly rely on structural perturbations, which often degrade graph integrity; meanwhile, VQ-VAEs applied to graph data suffer from two key limitations: uneven codebook utilization and spatial sparsity. To address these issues, we propose the Hierarchical Vector Quantized Graph Autoencoder (HVQ-GAE). Our method introduces an annealing-based codebook selection mechanism to mitigate codebook utilization bias and designs a dual-level hierarchical codebook—where the bottom level captures node semantic similarity and the top level encodes topological structural dependencies—enabling joint optimization of graph representations. This work is the first to integrate annealing-driven vector quantization and hierarchical codebooks into graph self-supervised learning. Extensive experiments demonstrate that HVQ-GAE significantly outperforms 16 state-of-the-art baselines across multiple benchmark datasets, achieving new state-of-the-art performance on both link prediction and node classification tasks.
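The annealing-based selection described above can be read as temperature-annealed stochastic code assignment: early in training, a high temperature spreads assignment probability across many codes, and as the temperature decays, selection converges to the nearest code. A minimal NumPy sketch of this idea follows; the function name, the exponential schedule, and all hyperparameters are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def select_codes(z, codebook, temperature, rng):
    """Stochastically assign each embedding in z to a codebook entry.

    High temperature -> near-uniform sampling (broad codebook usage);
    low temperature -> approaches hard nearest-code assignment.
    """
    # Squared Euclidean distances, shape (num_nodes, num_codes)
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    logits = -d / temperature
    # Numerically stable softmax over codes
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    return np.array([rng.choice(len(codebook), p=p) for p in probs])

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))   # 8 codes of dimension 4 (toy sizes)
z = rng.normal(size=(32, 4))         # 32 node embeddings

# Hypothetical exponential annealing schedule
t0, decay = 5.0, 0.95
for step in range(100):
    temperature = max(t0 * decay ** step, 1e-3)
    codes = select_codes(z, codebook, temperature, rng)

# At low temperature this matches hard nearest-code assignment
hard = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1).argmin(1)
```

The softmax-over-negative-distances form mirrors standard VQ-VAE quantization in the zero-temperature limit, while the early high-temperature phase is what mitigates codebook underutilization.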

📝 Abstract
Graph self-supervised learning has gained significant attention recently. However, many existing approaches heavily depend on perturbations, and inappropriate perturbations may corrupt the graph's inherent information. The Vector Quantized Variational Autoencoder (VQ-VAE) is a powerful autoencoder extensively used in fields such as computer vision; however, its application to graph data remains underexplored. In this paper, we provide an empirical analysis of vector quantization in the context of graph autoencoders, demonstrating that it significantly enhances the model's capacity to capture graph topology. Furthermore, we identify two key challenges associated with vector quantization when applied to graph data: codebook underutilization and codebook space sparsity. For the first challenge, we propose an annealing-based encoding strategy that promotes broad code utilization in the early stages of training, gradually shifting focus toward the most effective codes as training progresses. For the second challenge, we introduce a hierarchical two-layer codebook that captures relationships between embeddings through clustering. The second-layer codebook links similar codes, encouraging the model to learn closer embeddings for nodes with similar features and structural topology in the graph. Our proposed model outperforms 16 representative baseline methods on self-supervised link prediction and node classification tasks across multiple datasets.
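The hierarchical two-layer codebook described in the abstract links similar bottom-level codes through clustering. A minimal sketch of one plausible construction follows: k-means over the bottom-level codes yields top-level centroids, and each bottom code is linked to its top-level parent. The helper name and the plain k-means procedure are assumptions for illustration; the paper's exact clustering method may differ.

```python
import numpy as np

def build_top_codebook(bottom_codebook, num_top, iters=20, seed=0):
    """Cluster bottom-level codes to form a top-level codebook.

    Returns the top-level centroids and, for each bottom code, the index
    of the top-level code it is linked to.
    """
    rng = np.random.default_rng(seed)
    # Initialize centroids from distinct bottom codes (copy via fancy indexing)
    centroids = bottom_codebook[
        rng.choice(len(bottom_codebook), num_top, replace=False)
    ]
    for _ in range(iters):
        # Assign each bottom code to its nearest top-level centroid
        d = ((bottom_codebook[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        # Recompute centroids as the mean of their member codes
        for k in range(num_top):
            members = bottom_codebook[assign == k]
            if len(members):
                centroids[k] = members.mean(0)
    return centroids, assign

rng = np.random.default_rng(1)
bottom = rng.normal(size=(64, 16))   # 64 bottom-level codes, dim 16 (toy sizes)
top, link = build_top_codebook(bottom, num_top=8)
```

Bottom codes sharing a top-level parent can then be pulled toward the same centroid during training, which is one way the second layer densifies a sparse codebook space.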
Problem

Research questions and friction points this paper is trying to address.

Enhancing graph topology capture via vector quantization
Addressing codebook underutilization in graph data
Mitigating codebook space sparsity hierarchically
Innovation

Methods, ideas, or system contributions that make the work stand out.

Annealing-based encoding strategy for code utilization
Hierarchical two-layer codebook for embedding relationships
Vector quantization enhances graph topology capture
Authors

Long Zeng — East China Normal University, Shanghai, China
Jianxiang Yu — East China Normal University (Data mining, Large language models)
Jiapeng Zhu — East China Normal University, Shanghai, China
Qingsong Zhong — Unknown affiliation (AI4S)
Xiang Li — East China Normal University, Shanghai, China