🤖 AI Summary
To address the poor scalability and high latency of densest subgraph discovery (DSD) methods for fraud detection in large-scale financial and e-commerce transaction graphs, this paper proposes Dupin, a parallel DSD framework that exploits the intrinsic parallelism of the peeling process. The approach combines parallel graph computation, density-aware peeling optimization, and adaptive iterative scheduling, and it comes with theoretical guarantees on both detection quality and efficiency. It exposes high-level APIs for customizing DSD objectives and adapting to different fraud detection scenarios. In the reported experiments, Dupin achieves up to a 100× performance improvement over traditional sequential approaches. On billion-scale graphs, the share of fraudulent transactions prevented rises from 45% to 94.5%, and density error falls from 30% to below 5%.
📝 Abstract
Detecting fraudulent activities in financial and e-commerce transaction networks is crucial. One effective method for this is Densest Subgraph Discovery (DSD). However, deploying DSD methods in production systems faces substantial scalability challenges: existing methods are predominantly sequential, which limits their ability to handle large-scale transaction networks and results in significant detection delays. To address these challenges, we introduce Dupin, a novel parallel processing framework designed for efficient DSD on billion-scale graphs. Dupin is powered by a processing engine that exploits the unique properties of the peeling process, with theoretical guarantees on detection quality and efficiency. Dupin provides user-friendly APIs for flexible customization of DSD objectives and adapts robustly to diverse fraud detection scenarios. Empirical evaluations demonstrate that Dupin consistently outperforms several existing DSD methods, achieving performance improvements of up to 100 times compared to traditional approaches. On billion-scale graphs, Dupin demonstrates the potential to increase the prevention of fraudulent transactions from 45% to 94.5% and reduces density error from 30% to below 5%. These findings highlight the effectiveness of Dupin in real-world applications, delivering both speed and accuracy in fraud detection.
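The abstract refers to the peeling process without spelling it out. For orientation, here is a minimal sketch of classic sequential greedy peeling for the average-degree density |E|/|V|: repeatedly remove a minimum-degree vertex and keep the densest intermediate subgraph, which is known to be a 1/2-approximation for this objective. This illustrates the kind of sequential baseline the abstract contrasts with; it is not Dupin's parallel engine or its API. The function name `greedy_peel` and the edge-list input format are assumptions made for this example.

```python
import heapq
from collections import defaultdict


def greedy_peel(edges):
    """Sequential greedy peeling; returns (vertex set, density = |E|/|V|).

    Illustrative sketch only. Vertex labels must be mutually comparable
    (e.g. all ints), because they break ties inside the heap.
    """
    # Build an undirected simple graph, ignoring self-loops and duplicates.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    num_edges = sum(degree.values()) // 2
    alive = set(adj)

    # Min-heap of (degree, vertex); outdated entries are skipped lazily.
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)

    best_density = num_edges / len(alive) if alive else 0.0
    removal_order = []
    best_prefix = 0  # number of removals that precede the best snapshot

    while heap:
        d, v = heapq.heappop(heap)
        if v not in alive or d != degree[v]:
            continue  # stale heap entry
        # Peel v: it takes its remaining incident edges with it.
        alive.remove(v)
        removal_order.append(v)
        num_edges -= degree[v]
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                heapq.heappush(heap, (degree[w], w))
        if alive:
            density = num_edges / len(alive)
            if density > best_density:
                best_density = density
                best_prefix = len(removal_order)

    best_nodes = set(adj) - set(removal_order[:best_prefix])
    return best_nodes, best_density
```

Each step depends on the degrees left behind by the previous removal, which is the sequential bottleneck the abstract describes. Parallel variants typically relax this by removing many low-degree vertices per round.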