Cluster Aware Graph Anomaly Detection

📅 2024-09-15

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This paper targets unsupervised anomaly detection on multi-view graphs, where existing methods are held back by high dimensionality, label scarcity, and strong assumptions about graph structure. It proposes CARE, a cluster-aware framework that augments the graph's adjacency matrix with pseudo-labels (soft membership assignments) to capture both local and global node affinities. A similarity-guided loss mitigates the bias that the pseudo-labels may introduce. The theoretical analysis shows that this loss is a variant of contrastive learning loss and connects its bias reduction to graph spectral clustering. In experiments, CARE outperforms the second-best competitors by more than 39% in AUPRC on the Amazon dataset and by 18.7% in AUROC on the YelpChi dataset.

📝 Abstract

Graph anomaly detection has gained significant attention across various domains, particularly in critical applications like fraud detection in e-commerce platforms and insider threat detection in cybersecurity. Such data are usually composed of multiple types (e.g., user information and transaction records for financial data) and thus exhibit view heterogeneity. However, in the era of big data, the heterogeneity of views and the lack of label information pose substantial challenges to traditional approaches. Existing unsupervised graph anomaly detection methods often struggle with high dimensionality, rely on strong assumptions about graph structures, or fail to handle complex multi-view graphs. To address these challenges, we propose a cluster aware multi-view graph anomaly detection method, called CARE. Our approach captures both local and global node affinities by augmenting the graph's adjacency matrix with the pseudo-label (i.e., soft membership assignments) without any strong assumption about the graph. To mitigate potential biases from the pseudo-label, we introduce a similarity-guided loss. Theoretically, we show that the proposed similarity-guided loss is a variant of contrastive learning loss, and we show how this loss alleviates the bias introduced by the pseudo-label through its connection to graph spectral clustering. Experimental results on several datasets demonstrate the effectiveness and efficiency of our proposed framework. Specifically, CARE outperforms the second-best competitors by more than 39% on the Amazon dataset with respect to AUPRC and by 18.7% on the YelpChi dataset with respect to AUROC. The code of our method is available on GitHub: https://github.com/zhenglecheng/CARE-demo.
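The abstract reports results in AUROC and AUPRC. As a reminder of what these two metrics measure, here is a minimal pure-Python sketch. It assumes AUPRC is estimated by average precision, which is a common convention but not something the abstract confirms; the function names and the toy scores are hypothetical.

```python
def auroc(scores, labels):
    """Probability that a random anomaly outscores a random normal node (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of precision@rank taken at each true anomaly, ranking by descending score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy anomaly scores (hypothetical): label 1 marks an anomaly.
demo_scores = [0.9, 0.8, 0.3, 0.2, 0.1]
demo_labels = [1, 0, 1, 0, 0]
demo_auroc = auroc(demo_scores, demo_labels)
demo_ap = average_precision(demo_scores, demo_labels)
```

Both functions assume at least one anomaly and one normal node are present. AUPRC is usually the more informative of the two when anomalies are rare, as in fraud data.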

Problem

Research questions and friction points this paper is trying to address.

Detecting anomalies in multi-view graphs without label information

Coping with high dimensionality and label scarcity

Avoiding strong assumptions about graph structure while limiting pseudo-label bias

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cluster aware anomaly detection

Multi-view graph augmentation

Similarity-guided loss function
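The adjacency augmentation idea listed above can be illustrated with a small sketch. It assumes the augmented matrix takes the form A + w * S S^T, where S holds row-wise softmax memberships. That form, the `weight` parameter, and the function names are illustrative assumptions, not the formulation used in CARE; see the authors' repository for the actual method.

```python
import math


def softmax(row):
    """Numerically stable softmax over one node's cluster logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def augment_adjacency(adj, logits, weight=1.0):
    """Return (adj + weight * S S^T, S), with S the soft membership matrix.

    Assumed form for illustration only: entry (i, j) of S S^T is large when
    nodes i and j share cluster mass, even if they are not directly linked.
    The diagonal is included, so every node also gains a self-affinity term.
    """
    n = len(adj)
    soft = [softmax(r) for r in logits]
    aug = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            shared = sum(a * b for a, b in zip(soft[i], soft[j]))
            aug[i][j] = adj[i][j] + weight * shared
    return aug, soft


# Path graph 0-1-2-3; hypothetical logits put nodes 0..2 in cluster 0, node 3 in cluster 1.
demo_adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
demo_logits = [[4.0, 0.0], [4.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
demo_aug, demo_soft = augment_adjacency(demo_adj, demo_logits)
```

In this toy graph, nodes 0 and 2 are not adjacent but fall in the same cluster, so their augmented affinity exceeds that of nodes 0 and 3. That is the sense in which cluster memberships add global structure on top of local edges.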

🔎 Similar Papers

Graph Anomaly Detection in Time Series: A Survey