Dupin: A Parallel Framework for Densest Subgraph Discovery in Fraud Detection on Massive Graphs (Technical Report)

📅 2025-04-12

🤖 AI Summary

To address the poor scalability and high latency of densest subgraph discovery (DSD) methods for fraud detection in large-scale financial and e-commerce transaction graphs, this paper proposes Dupin, a parallel DSD framework that exploits the intrinsic parallelism of the peeling process. The approach integrates parallel graph computation, density-aware peeling optimization, and adaptive iterative scheduling, with theoretical guarantees on detection quality and efficiency. It supports customizable DSD objectives and adapts to diverse fraud detection scenarios through a clean, high-level API. In the reported evaluation, Dupin achieves performance improvements of up to 100× over existing DSD methods; on billion-scale graphs, the share of fraudulent transactions it could prevent rises from 45% to 94.5%, and density error falls from 30% to below 5%.

📝 Abstract

Detecting fraudulent activities in financial and e-commerce transaction networks is crucial, and Densest Subgraph Discovery (DSD) is one effective method for doing so. However, deploying DSD methods in production systems faces substantial scalability challenges: existing methods are predominantly sequential, which limits their ability to handle large-scale transaction networks and results in significant detection delays. To address these challenges, we introduce Dupin, a novel parallel processing framework designed for efficient DSD processing on billion-scale graphs. Dupin is powered by a processing engine that exploits the unique properties of the peeling process, with theoretical guarantees on detection quality and efficiency. Dupin provides user-friendly APIs for flexible customization of DSD objectives and ensures robust adaptability to diverse fraud detection scenarios. Empirical evaluations demonstrate that Dupin consistently outperforms several existing DSD methods, achieving performance improvements of up to 100 times over traditional approaches. On billion-scale graphs, our experimental results show that Dupin has the potential to raise the share of fraudulent transactions prevented from 45% to 94.5% and to reduce density error from 30% to below 5%. These findings highlight the effectiveness of Dupin in real-world applications, ensuring both speed and accuracy in fraud detection.
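For readers unfamiliar with peeling, the sequential baseline that the abstract contrasts against can be sketched as the classic greedy peeling heuristic: repeatedly remove a minimum-degree vertex and remember the densest intermediate subgraph. This is a minimal illustration, not Dupin's engine. It assumes an unweighted, undirected graph, takes density to be the number of edges divided by the number of vertices, and uses a hypothetical function name `greedy_peel`.

```python
import heapq
from collections import defaultdict


def greedy_peel(edges):
    """Sequential greedy peeling (illustrative; not Dupin's engine).

    Assumes an undirected, unweighted graph given as (u, v) pairs with
    mutually comparable vertex labels, and density = |E| / |V|.
    Returns (best_density, best_vertex_set).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    num_edges = sum(degree.values()) // 2
    alive = set(adj)

    # Min-heap of (degree, vertex); outdated entries are skipped lazily.
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)

    best_density = num_edges / len(alive) if alive else 0.0
    removal_order = []
    best_prefix = 0  # removals performed before the best snapshot

    while alive:
        d, v = heapq.heappop(heap)
        if v not in alive or d != degree[v]:
            continue  # stale heap entry
        alive.remove(v)
        removal_order.append(v)
        num_edges -= degree[v]  # degree[v] counts only surviving neighbors
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
        if alive:
            density = num_edges / len(alive)
            if density > best_density:
                best_density = density
                best_prefix = len(removal_order)

    best_set = set(adj) - set(removal_order[:best_prefix])
    return best_density, best_set


# A 4-clique {0, 1, 2, 3} with a pendant path 3-4-5 attached.
demo_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
demo_density, demo_set = greedy_peel(demo_edges)
print(demo_density, sorted(demo_set))  # 1.5 [0, 1, 2, 3]
```

Each removal depends on the degrees left by the previous one, so the loop runs one vertex at a time. That dependency is the sequential bottleneck the abstract refers to.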

Problem

Research questions and friction points this paper is trying to address.

Scalability challenges in sequential densest subgraph discovery methods

Significant detection delays when handling large-scale transaction networks

Need for efficient parallel processing in billion-scale fraud detection graphs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel framework for billion-scale graph processing

Peeling process engine with theoretical guarantees

User-friendly APIs for customizable DSD objectives
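To make the idea of parallelism in peeling concrete, the sketch below shows a well-known batch variant: in each round, every vertex whose degree is at most 2(1 + ε) times the current density is removed at once. This is a generic textbook-style technique, not the algorithm or API described in the paper. The function name `batch_peel` and the parameter `eps` are illustrative assumptions, and the per-round steps are written as ordinary Python loops where a real system would distribute the work across threads or machines.

```python
from collections import defaultdict


def batch_peel(edges, eps=0.1):
    """Batch peeling sketch (illustrative; not Dupin's algorithm or API).

    Each round removes all vertices with degree <= 2 * (1 + eps) * density,
    where density = |E| / |V| of the surviving subgraph.
    Returns (best_density, best_vertex_set, rounds).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    alive = set(adj)
    degree = {v: len(adj[v]) for v in alive}
    num_edges = sum(degree.values()) // 2
    best_density, best_set, rounds = 0.0, set(), 0

    while alive:
        density = num_edges / len(alive)
        if density > best_density:
            best_density, best_set = density, set(alive)

        # The threshold test is independent per vertex, so this is the
        # step that could be evaluated in parallel. The batch is never
        # empty because the minimum degree is at most the average, 2 * density.
        threshold = 2.0 * (1.0 + eps) * density
        batch = {v for v in alive if degree[v] <= threshold}

        alive -= batch
        for v in batch:
            for w in adj[v]:
                if w in alive:
                    degree[w] -= 1  # a surviving neighbor lost one edge
        num_edges = sum(degree[v] for v in alive) // 2
        rounds += 1

    return best_density, best_set, rounds


# Same demo graph: a 4-clique {0, 1, 2, 3} with a pendant path 3-4-5.
demo_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
batch_density, batch_set, batch_rounds = batch_peel(demo_edges)
print(batch_density, sorted(batch_set), batch_rounds)  # 1.5 [0, 1, 2, 3] 2
```

On this small graph the batch variant finds the 4-clique in two rounds, whereas one-vertex-at-a-time peeling needs a separate step per vertex. The trade-off is that batch removal generally yields an approximate answer whose quality depends on `eps`.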
