Dupin: A Parallel Framework for Densest Subgraph Discovery in Fraud Detection on Massive Graphs (Technical Report)

📅 2025-04-12
🤖 AI Summary
To address the poor scalability and high latency of densest subgraph discovery (DSD) methods for fraud detection in large-scale financial and e-commerce transaction graphs, this paper proposes Dupin, a parallel DSD framework that exploits the properties of the peeling process. The approach integrates parallel graph computation, density-aware peeling optimization, and adaptive iterative scheduling, with theoretical guarantees on detection quality and efficiency. It exposes user-friendly APIs for customizing DSD objectives and adapting to different fraud detection scenarios. In the reported experiments, Dupin achieves performance improvements of up to 100× over traditional approaches; on billion-scale graphs, the share of fraudulent transactions prevented rises from 45% to 94.5%, and density error falls from 30% to below 5%.

📝 Abstract
Detecting fraudulent activities in financial and e-commerce transaction networks is crucial. One effective method for this is Densest Subgraph Discovery (DSD). However, deploying DSD methods in production systems faces substantial scalability challenges due to the predominantly sequential nature of existing methods, which impedes their ability to handle large-scale transaction networks and results in significant detection delays. To address these challenges, we introduce Dupin, a novel parallel processing framework designed for efficient DSD processing in billion-scale graphs. Dupin is powered by a processing engine that exploits the unique properties of the peeling process, with theoretical guarantees on detection quality and efficiency. Dupin provides user-friendly APIs for flexible customization of DSD objectives and ensures robust adaptability to diverse fraud detection scenarios. Empirical evaluations demonstrate that Dupin consistently outperforms several existing DSD methods, achieving performance improvements of up to 100 times compared to traditional approaches. On billion-scale graphs, Dupin demonstrates the potential to enhance the prevention of fraudulent transactions from 45% to 94.5% and reduces density error from 30% to below 5%, as supported by our experimental results. These findings highlight the effectiveness of Dupin in real-world applications, ensuring both speed and accuracy in fraud detection.
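For readers unfamiliar with the peeling process the abstract refers to, the following is a minimal sketch of the classic sequential greedy peeling heuristic for the average-degree density |E|/|V|. It is not Dupin's engine or API: the function name `greedy_peel`, the edge-list input, and the choice of density metric are assumptions made for illustration only. It does show why the sequential form is hard to scale, since each step removes exactly one minimum-degree vertex.

```python
import heapq
from collections import defaultdict


def greedy_peel(edges):
    """Sequential greedy peeling for density |E|/|V| (illustrative only).

    Repeatedly removes a minimum-degree vertex and remembers the densest
    intermediate subgraph. Returns (best_density, best_vertex_set).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    num_edges = sum(degree.values()) // 2
    alive = set(adj)

    # Min-heap of (degree, vertex); stale entries are skipped lazily.
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)

    best_density = num_edges / len(alive) if alive else 0.0
    removal_order = []
    best_prefix = 0  # number of removals performed before the best snapshot

    while alive:
        d, v = heapq.heappop(heap)
        if v not in alive or d != degree[v]:
            continue  # outdated heap entry
        alive.remove(v)
        removal_order.append(v)
        num_edges -= degree[v]  # degree[v] counts only surviving neighbours
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
        if alive:
            density = num_edges / len(alive)
            if density > best_density:
                best_density = density
                best_prefix = len(removal_order)

    best_set = set(adj) - set(removal_order[:best_prefix])
    return best_density, best_set


# Toy graph: a 4-clique (0..3) with a two-edge tail 3-4-5 attached.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
density, vertices = greedy_peel(toy_edges)
```

On the toy graph, peeling strips the tail first and the 4-clique remains as the densest intermediate subgraph.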
Problem

Research questions and friction points this paper is trying to address.

Scalability challenges in sequential densest subgraph discovery methods
Handling large-scale transaction networks with significant detection delays
Need for efficient parallel processing in billion-scale fraud detection graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel framework for billion-scale graph processing
Peeling process engine with theoretical guarantees
User-friendly APIs for customizable DSD objectives
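One well-known way to expose parallelism in peeling is to remove a whole batch of low-degree vertices per round instead of one vertex at a time. The sketch below follows that general batch-peeling idea (remove every vertex whose degree is at most 2(1+ε) times the current density) and is offered only as an illustration: it is not taken from the paper, and the names `batch_peel` and `eps` are hypothetical. The loop is written sequentially, but the vertices in each batch are selected independently of one another, which is the part a parallel engine could distribute across workers.

```python
from collections import defaultdict


def batch_peel(edges, eps=0.1):
    """Batch peeling for density |E|/|V| (illustrative only).

    Each round removes all vertices with degree <= 2 * (1 + eps) * density.
    Returns (best_density, best_vertex_set, number_of_rounds).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    alive = set(adj)
    degree = {v: len(adj[v]) for v in alive}
    num_edges = sum(degree.values()) // 2

    best_density, best_set = 0.0, set()
    rounds = 0
    while alive:
        density = num_edges / len(alive)
        if density > best_density:
            best_density, best_set = density, set(alive)

        # Average degree is 2 * density, so the minimum-degree vertex always
        # qualifies and every round removes at least one vertex.
        threshold = 2 * (1 + eps) * density
        batch = {v for v in alive if degree[v] <= threshold}

        alive -= batch
        for v in batch:
            for w in adj[v]:
                if w in alive:
                    degree[w] -= 1  # survivor loses a removed neighbour
        num_edges = sum(degree[v] for v in alive) // 2
        rounds += 1

    return best_density, best_set, rounds


# Same toy graph as a 4-clique (0..3) with a two-edge tail 3-4-5.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
density, vertices, rounds = batch_peel(toy_edges, eps=0.1)
```

A larger `eps` removes more vertices per round, so fewer rounds are needed, at the cost of a looser bound on how far the result can fall below the optimum.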
Jiaxin Jiang
National University of Singapore, Singapore
Siyuan Yao
University of Notre Dame
Visualization, Computer Graphics, Computer Vision
Yuchen Li
Singapore Management University, Singapore
Qiange Wang
National University of Singapore, Singapore
Bingsheng He
National University of Singapore, Singapore
Min Chen
GrabTaxi Holdings, Singapore