On the Optimization of Methods for Establishing Well-Connected Communities

2025-08-28
AI Summary
Existing community detection methods often yield disconnected or weakly connected clusters on large-scale graphs, compromising interpretability and robustness. Post-processing techniques such as Well-Connected Clusters (WCC) and Connectivity Modifier (CM) improve connectivity, but their computational cost limits scalability. This paper introduces optimized parallel implementations of WCC and CM written in the HPE Chapel programming language, enabling well-connected community detection on graphs with more than two billion edges in minutes. The implementations use Chapel's parallel constructs to target modern multicore architectures and are integrated into the Arkouda/Arachne framework. Using 128 cores, the approach clusters the OpenAlex graph (more than 2 billion edges), a computation the authors describe as previously infeasible.

πŸ“ Abstract
Community detection plays a central role in uncovering mesoscale structures in networks. However, existing methods often suffer from disconnected or weakly connected clusters, undermining interpretability and robustness. Well-Connected Clusters (WCC) and Connectivity Modifier (CM) algorithms are post-processing techniques that improve the accuracy of many clustering methods, but they are computationally prohibitive on massive graphs. In this work, we present optimized parallel implementations of WCC and CM using the HPE Chapel programming language. First, we design fast and efficient parallel algorithms that leverage Chapel's parallel constructs to achieve substantial performance improvements and scalability on modern multicore architectures. Second, we integrate this software into Arkouda/Arachne, an open-source, high-performance framework for large-scale graph analytics. Our implementations uniquely enable well-connected community detection on massive graphs with more than 2 billion edges, providing a practical solution for connectivity-preserving clustering at web scale. For example, our implementations of WCC and CM enable community detection on the OpenAlex dataset, which has more than 2 billion edges, in minutes using 128 cores, a result that was previously infeasible to compute.
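
As a rough illustration of the connectivity problem the abstract describes, the sketch below splits each cluster into the connected components of its induced subgraph. This is not WCC or CM and is not taken from the paper: it only enforces plain connectedness, the weakest form of the property those methods target, and it runs serially in Python rather than in Chapel. The function name `split_disconnected_clusters` and the input format (an edge list plus a node-to-cluster dictionary) are assumptions made for this example.

```python
from collections import defaultdict, deque


def split_disconnected_clusters(edges, cluster_of):
    """Relabel nodes so that every output cluster is connected.

    edges      -- iterable of (u, v) pairs of an undirected graph (assumed format)
    cluster_of -- dict mapping each node to its original cluster label
    Returns a dict mapping each node to a new integer cluster id.
    """
    # Keep only edges whose endpoints share a cluster (the induced subgraphs).
    adj = defaultdict(set)
    for u, v in edges:
        if cluster_of[u] == cluster_of[v]:
            adj[u].add(v)
            adj[v].add(u)

    new_label = {}
    next_id = 0
    for start in sorted(cluster_of):
        if start in new_label:
            continue
        # Breadth-first search over intra-cluster edges only.
        new_label[start] = next_id
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj.get(node, ()):
                if nbr not in new_label:
                    new_label[nbr] = next_id
                    queue.append(nbr)
        next_id += 1
    return new_label


# Cluster "a" contains two pieces, {0, 1, 2} and {3, 4}, with no edge between them.
edges = [(0, 1), (1, 2), (3, 4), (2, 5)]
cluster_of = {0: "a", 1: "a", 2: "a", 3: "a", 4: "a", 5: "b"}
labels = split_disconnected_clusters(edges, cluster_of)
print(labels)
```

In this toy input, the original cluster "a" is split in two because nodes 3 and 4 have no intra-cluster path to nodes 0, 1, and 2; the edge (2, 5) is ignored because it crosses cluster boundaries.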
Problem

Research questions and friction points this paper is trying to address.

Optimizing parallel implementations for well-connected community detection
Addressing computational challenges of WCC and CM on massive graphs
Enabling scalable connectivity-preserving clustering for billion-edge networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel WCC and CM algorithms using Chapel
Integration with Arkouda/Arachne framework
Enables billion-edge community detection in minutes
Mohammad Dindoost
New Jersey Institute of Technology, Newark, NJ, USA
Oliver Alvarado Rodriguez
New Jersey Institute of Technology, Newark, NJ, USA
Bartosz Bryg
New Jersey Institute of Technology, Newark, NJ, USA
Minhyuk Park
Graduate Student, University of Illinois Urbana-Champaign
George Chacko
University of Illinois Urbana-Champaign, Urbana, IL, USA
Tandy Warnow
Grainger Distinguished Chair in Engineering, UIUC
Computer Science, Computational Biology, Phylogenetics, Metagenomics, Multiple Sequence Alignment
David A. Bader
Distinguished Professor, New Jersey Institute of Technology
data science, high performance computing, cybersecurity, massive-scale analytics, computational genomics