THESAURUS: Contrastive Graph Clustering by Swapping Fused Gromov-Wasserstein Couplings

📅 2024-12-16
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing unsupervised graph node clustering methods rely heavily on K-means, which makes them vulnerable when the cluster separability of the encoder output is low and leads to the Uniform Effect and Cluster Assimilation. The low separability stems from a lack of contextual information, misalignment between the training task and the downstream clustering objective, and insufficient exploitation of the cluster information in the graph structure. Method: The paper proposes THESAURUS, which combines (i) semantic prototypes that supply contextual information; (ii) a cross-view assignment prediction pretext task that aligns representation learning with clustering; and (iii) Gromov-Wasserstein optimal transport (GW-OT) with a prototype graph to exploit cluster information in the graph structure. The prototype graph and the prototype marginal distribution in OT are updated with momentum to adapt to diverse real-world data. Contribution/Results: Extensive experiments show that THESAURUS achieves higher cluster separability than prior methods, mitigating the Uniform Effect and Cluster Assimilation.

📝 Abstract
Graph node clustering is a fundamental unsupervised task. Existing methods typically train an encoder through self-supervised learning and then apply K-means to the encoder output. Some methods use this clustering result directly as the final assignment, while others initialize centroids from this initial clustering and then fine-tune both the encoder and these learnable centroids. However, because they rely on K-means, these methods inherit its drawbacks when the cluster separability of the encoder output is low, facing challenges from the Uniform Effect and Cluster Assimilation. We summarize three reasons for the low cluster separability in existing methods: (1) lack of contextual information prevents discrimination between similar nodes from different clusters; (2) training tasks are not sufficiently aligned with the downstream clustering task; (3) the cluster information in the graph structure is not appropriately exploited. To address these issues, we propose conTrastive grapH clustEring by SwApping fUsed gRomov-wasserstein coUplingS (THESAURUS). Our method introduces semantic prototypes to provide contextual information, and employs a cross-view assignment prediction pretext task that aligns well with the downstream clustering task. Additionally, it utilizes Gromov-Wasserstein Optimal Transport (GW-OT) along with the proposed prototype graph to thoroughly exploit cluster information in the graph structure. To adapt to diverse real-world data, THESAURUS updates the prototype graph and the prototype marginal distribution in OT by using momentum. Extensive experiments demonstrate that THESAURUS achieves higher cluster separability than the prior art, effectively mitigating the Uniform Effect and Cluster Assimilation issues.
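The cross-view assignment prediction task described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes L2-normalized node embeddings from two augmented views, and it uses plain entropic optimal transport (Sinkhorn iterations) as a simplified stand-in for the fused Gromov-Wasserstein couplings the paper swaps. The function names, `eps`, and `temperature` are hypothetical choices for illustration.

```python
import numpy as np


def sinkhorn(scores, col_marginal, eps=0.05, n_iters=50):
    """Entropic OT plan between n nodes (uniform mass) and k prototypes."""
    n, k = scores.shape
    kernel = np.exp((scores - scores.max()) / eps)  # shifted for numerical stability
    row_marginal = np.full(n, 1.0 / n)
    u, v = np.ones(n), np.ones(k)
    for _ in range(n_iters):
        u = row_marginal / (kernel @ v)
        v = col_marginal / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]  # coupling of shape (n, k)


def log_softmax(x):
    """Row-wise log-softmax."""
    x = x - x.max(axis=1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=1, keepdims=True))


def swapped_prediction_loss(z1, z2, prototypes, col_marginal, temperature=0.1):
    """Predict each view's assignment from the other view's embedding."""
    # Node-prototype similarities; inputs are assumed to be L2-normalized.
    s1, s2 = z1 @ prototypes.T, z2 @ prototypes.T
    q1, q2 = sinkhorn(s1, col_marginal), sinkhorn(s2, col_marginal)
    q1 = q1 / q1.sum(axis=1, keepdims=True)  # per-node target distributions
    q2 = q2 / q2.sum(axis=1, keepdims=True)
    p1, p2 = log_softmax(s1 / temperature), log_softmax(s2 / temperature)
    # Swap: view 1 predicts view 2's assignment and vice versa.
    return -0.5 * np.mean((q2 * p1).sum(axis=1) + (q1 * p2).sum(axis=1))
```

In a trainable version the targets `q1` and `q2` would typically be treated as constants (no gradient), so that the encoder and prototypes learn to make each view predict the other view's assignment. The `col_marginal` argument plays the role of the prototype marginal distribution mentioned in the abstract.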
Problem

Research questions and friction points this paper is trying to address.

Enhance graph node clustering separability
Address Uniform Effect and Cluster Assimilation
Exploit contextual and graph structure information
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces semantic prototypes for context
Uses Gromov-Wasserstein Optimal Transport
Updates prototype graph with momentum
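To make the Gromov-Wasserstein and momentum items above concrete, the sketch below evaluates the standard fused Gromov-Wasserstein objective (squared loss) for a fixed node-to-prototype coupling and applies an exponential moving average to the prototype marginal. It is a minimal illustration, not the paper's code: the trade-off weight `alpha`, the momentum `m`, and the use of the coupling's column sums as the batch estimate of the marginal are assumptions made here.

```python
import numpy as np


def fgw_objective(T, M, C_nodes, C_protos, alpha=0.5):
    """Fused GW objective for a fixed coupling T of shape (n, k).

    M        : (n, k) feature cost between nodes and prototypes
    C_nodes  : (n, n) structure matrix of the node graph
    C_protos : (k, k) structure matrix of the prototype graph
    """
    wasserstein = np.sum(M * T)  # feature (Wasserstein) term <M, T>
    p, q = T.sum(axis=1), T.sum(axis=0)  # marginals of the coupling
    # Expansion of sum_{i,j,k,l} (C_nodes[i,k] - C_protos[j,l])^2 T[i,j] T[k,l]
    gw = (p @ (C_nodes ** 2) @ p
          + q @ (C_protos ** 2) @ q
          - 2.0 * np.sum((C_nodes @ T @ C_protos.T) * T))
    return (1.0 - alpha) * wasserstein + alpha * gw


def momentum_update(old, new, m=0.9):
    """Exponential moving average: keep a fraction m of the old estimate."""
    return m * old + (1.0 - m) * new
```

For example, `momentum_update(prototype_marginal, T.sum(axis=0))` blends the running marginal with the mass each prototype received in the current batch; because both inputs sum to one, the result is still a probability vector. The same averaging rule could be applied to a prototype graph, given some batch estimate of it.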
Bowen Deng
Postdoc at MIT | PhD at UC Berkeley
Machine Learning, AI for Science, Computational Materials, Energy Materials
Tong Wang
School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China; School of Systems Science and Engineering, Sun Yat-sen University, Guangzhou, China
Lele Fu
School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China; School of Systems Science and Engineering, Sun Yat-sen University, Guangzhou, China
Sheng Huang
School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China; School of Systems Science and Engineering, Sun Yat-sen University, Guangzhou, China
Chuan Chen
University of Wisconsin, Madison
Applied Microeconomics
Tao Zhang
School of Systems Science and Engineering, Sun Yat-sen University, Guangzhou, China