Deep Cut-informed Graph Embedding and Clustering

📅 2025-03-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing GNN-based graph clustering methods suffer from representation collapse, caused by the inductive bias of GNNs and by center-oriented clustering losses. The result is homogenized node representations and biased cluster labels. Method: We propose DCGC, a non-GNN framework grounded in graph cut theory. It introduces a cut-aware embedding objective that fuses structural and attribute information by minimizing their joint normalized cut, abandons GNNs to avoid over-smoothing of neighborhood representations, and uses optimal transport for self-supervised cluster assignment, which mitigates center-oriented degeneration. Contribution/Results: Evaluated on multiple benchmark datasets, DCGC is reported to outperform GNN-based baselines, suppressing representation collapse while improving clustering accuracy and robustness.

📝 Abstract

Graph clustering aims to divide a graph into different clusters. Recently emerging deep graph clustering approaches are largely built on graph neural networks (GNNs). However, GNNs are designed for general graph encoding, and representation collapse is a common issue in existing GNN-based deep graph clustering algorithms. We attribute this issue to two main causes: (i) the inductive bias of GNN models: GNNs tend to generate similar representations for proximal nodes. Since graphs often contain a non-negligible number of inter-cluster links, this bias results in erroneous message passing and leads to biased clustering; (ii) the clustering-guided loss function: most traditional approaches strive to pull all samples closer to pre-learned cluster centers, which can cause a degenerate solution that assigns all data points to a single label and thus makes all samples similar and less discriminative. To address these challenges, we investigate graph clustering from a graph cut perspective and propose an innovative, non-GNN-based Deep Cut-informed Graph embedding and Clustering framework, namely DCGC. This framework includes two modules: (i) cut-informed graph encoding; (ii) self-supervised graph clustering via optimal transport. For the encoding module, we derive a cut-informed graph embedding objective that fuses graph structure and attributes by minimizing their joint normalized cut. For the clustering module, we use optimal transport theory to obtain the clustering assignments, which balances the guidance of proximity to the pre-learned cluster centers. With these two tailored designs, DCGC is better suited to the graph clustering task: it effectively alleviates representation collapse and achieves better performance. Extensive experiments demonstrate that our method is simple yet effective compared with benchmark methods.
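The abstract does not spell out the joint normalized cut objective, so the following is only a minimal sketch of the standard normalized cut of a partition, i.e. the sum over clusters of cut(A_k, rest) / vol(A_k), on a toy unweighted graph. The function name `normalized_cut`, the edge list, and the two labelings are illustrative assumptions and are not taken from the paper. DCGC's objective additionally fuses node attributes, which this sketch omits.

```python
def normalized_cut(edges, labels):
    """Standard normalized cut: sum over clusters of cut(A_k, rest) / vol(A_k).

    edges  -- list of undirected (i, j) pairs, each with unit weight
    labels -- labels[i] is the cluster id of node i
    """
    clusters = set(labels)
    vol = {k: 0.0 for k in clusters}  # sum of node degrees inside each cluster
    cut = {k: 0.0 for k in clusters}  # weight of edges leaving each cluster
    for i, j in edges:
        vol[labels[i]] += 1.0
        vol[labels[j]] += 1.0
        if labels[i] != labels[j]:
            cut[labels[i]] += 1.0
            cut[labels[j]] += 1.0
    return sum(cut[k] / vol[k] for k in clusters if vol[k] > 0)


# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

aligned = normalized_cut(edges, [0, 0, 0, 1, 1, 1])      # follows the triangles
interleaved = normalized_cut(edges, [0, 1, 0, 1, 0, 1])  # ignores the structure
print(aligned, interleaved)
```

Under these assumptions, the labeling that follows the two triangles cuts a single edge and scores 2/7, while the interleaved labeling cuts five edges and scores 10/7, so an objective that minimizes the normalized cut prefers the former.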

Problem

Research questions and friction points this paper is trying to address.

Addresses representation collapse in GNN-based graph clustering.

Proposes a non-GNN framework for improved graph clustering.

Utilizes cut-informed encoding and optimal transport for clustering.
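To illustrate the representation-collapse problem, here is a toy sketch of how repeated neighborhood averaging homogenizes node representations when clusters are linked. It uses parameter-free mean aggregation with self-loops as a stand-in for GNN message passing. The helper names (`mean_aggregate`, `spread`), the graph, and the scalar features are hypothetical, and real GNN layers add learned weights and nonlinearities.

```python
def mean_aggregate(features, neighbors):
    """One round of neighborhood averaging; each node also includes itself."""
    out = []
    for i, nbrs in enumerate(neighbors):
        group = [i] + nbrs
        out.append(sum(features[j] for j in group) / len(group))
    return out


def spread(values):
    """Gap between the largest and smallest representation."""
    return max(values) - min(values)


# Hypothetical toy graph: two triangles {0,1,2} and {3,4,5}
# joined by one inter-cluster edge 2-3 (adjacency lists).
neighbors = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]

# Scalar features that separate the two clusters perfectly at the start.
features = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

history = [spread(features)]
for _ in range(50):
    features = mean_aggregate(features, neighbors)
    history.append(spread(features))

print(history[0], history[10], history[-1])
```

In this sketch the single inter-cluster edge is enough for the gap between the two clusters to shrink toward zero as rounds accumulate, which is the kind of homogenization the summary attributes to the inductive bias of GNNs.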

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cut-informed graph encoding minimizes joint normalized cut.

Self-supervised clustering uses optimal transport theory.

Non-GNN-based DCGC framework prevents representation collapse.
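As a rough illustration of the optimal-transport idea, the sketch below applies Sinkhorn-Knopp scaling to turn a score matrix into soft assignments whose cluster masses are equal. This is a common way to compute entropy-regularized balanced assignments, and it is an assumption that DCGC's clustering module resembles it. The function names `sinkhorn_assign` and `hard_labels`, the parameters `n_iters` and `epsilon`, and the toy scores are all hypothetical.

```python
import math


def sinkhorn_assign(scores, n_iters=50, epsilon=0.5):
    """Balanced soft assignments from an n x k score matrix (Sinkhorn-Knopp)."""
    n, k = len(scores), len(scores[0])
    # Gibbs kernel: a higher score gives a larger initial transport mass.
    q = [[math.exp(s / epsilon) for s in row] for row in scores]
    for _ in range(n_iters):
        # Scale columns so every cluster receives total mass 1/k.
        col = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] / (col[j] * k) for j in range(k)] for i in range(n)]
        # Scale rows so every sample carries total mass 1/n.
        row = [sum(r) for r in q]
        q = [[q[i][j] / (row[i] * n) for j in range(k)] for i in range(n)]
    # Multiply by n so each row sums to 1 and reads as a soft assignment.
    return [[v * n for v in r] for r in q]


def hard_labels(matrix):
    """Index of the largest entry in each row."""
    return [max(range(len(r)), key=r.__getitem__) for r in matrix]


# Hypothetical similarity scores to two cluster centers: every sample
# is closest to center 0, so a nearest-center rule collapses to one label.
scores = [[2.0, 1.0], [1.8, 1.0], [1.2, 1.0], [1.1, 1.0]]

naive = hard_labels(scores)
soft = sinkhorn_assign(scores)
balanced = hard_labels(soft)
print(naive, balanced)
```

With these toy scores, the nearest-center rule puts all four samples in cluster 0, whereas the balanced assignment keeps the two samples with the strongest preference in cluster 0 and moves the two weakest to cluster 1. The equal-mass constraint is a design choice of this sketch; other marginal constraints are possible.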

🔎 Similar Papers

Refined Graph Encoder Embedding via Self-Training and Latent Community Recovery