AI Summary
To address the inefficiency and difficulty of identifying overlapping communities in large-scale weighted directed networks, this paper proposes CoDeSEG, a heuristic algorithm based on a two-dimensional structural entropy game. Methodologically, it integrates structural entropy into a potential game framework to model community evolution, and designs a local strategy-updating mechanism together with a structural-entropy-driven rule for deciding which nodes overlap. Theoretically, CoDeSEG runs in near-linear time and, unlike many traditional methods, is not restricted to undirected or unweighted graphs. Extensive experiments on multiple real-world heterogeneous networks show that CoDeSEG achieves the fastest runtime while attaining state-of-the-art performance in overlapping normalized mutual information (ONMI) and F1 score. It thus advances efficient, accurate, and scalable detection of overlapping communities in very large complex networks.
Abstract
Community detection is a critical task in graph theory, social network analysis, and bioinformatics, where communities are defined as clusters of densely interconnected nodes. However, detecting communities in large-scale networks with millions of nodes and billions of edges remains challenging due to the inefficiency and unreliability of existing methods. Moreover, many current approaches are limited to specific graph types, such as unweighted or undirected graphs, which reduces their broader applicability. To address these issues, we propose a novel heuristic community detection algorithm, termed CoDeSEG, which identifies communities by minimizing the two-dimensional (2D) structural entropy of the network within a potential game framework. In the game, each node decides whether to stay in its current community or move to another, following a strategy that maximizes a utility function based on 2D structural entropy. Additionally, we introduce a structural entropy-based node overlapping heuristic for detecting overlapping communities, with near-linear time complexity. Experimental results on real-world networks demonstrate that CoDeSEG is the fastest method available and achieves state-of-the-art performance in overlapping normalized mutual information (ONMI) and F1 score.
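To make the objective concrete, here is a minimal sketch of the 2D structural entropy of a partition and of a single "stay or move" pass. It is not the authors' implementation. It assumes an undirected weighted graph without self-loops and uses the standard definition H = -Σ_j (g_j / vol(G)) log2(V_j / vol(G)) - Σ_i (d_i / vol(G)) log2(d_i / V_j), where d_i is a node's weighted degree, V_j a community's volume, and g_j the weight of edges leaving it. The function names and the toy graph are hypothetical. The sketch recomputes the full entropy for every candidate move, so it is not near-linear, and it handles neither directed edges nor overlapping nodes.

```python
import math
from collections import defaultdict


def structural_entropy_2d(edges, community):
    """2D structural entropy of an undirected weighted graph under a partition.

    edges: list of (u, v, weight) with u != v (assumption: no self-loops)
    community: dict mapping every node to a community label
    """
    degree = defaultdict(float)  # d_i: weighted degree of node i
    volume = defaultdict(float)  # V_j: sum of degrees inside community j
    cut = defaultdict(float)     # g_j: weight of edges leaving community j
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        if community[u] != community[v]:
            cut[community[u]] += w
            cut[community[v]] += w
    for node, d in degree.items():
        volume[community[node]] += d
    total = sum(degree.values())  # vol(G), twice the total edge weight

    entropy = 0.0
    for node, d in degree.items():
        entropy -= (d / total) * math.log2(d / volume[community[node]])
    for label, vol in volume.items():
        entropy -= (cut[label] / total) * math.log2(vol / total)
    return entropy


def best_response_pass(edges, community):
    """Let each node, in turn, stay or join the neighbouring community
    that lowers the entropy most. Mutates `community`; returns True if
    any node moved. Naive illustration: full recomputation per candidate.
    """
    neighbours = defaultdict(set)
    for u, v, _ in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    moved = False
    for node in sorted(neighbours):
        current = community[node]
        best_label = current
        best_entropy = structural_entropy_2d(edges, community)
        candidates = {community[n] for n in neighbours[node]} - {current}
        for label in sorted(candidates):
            community[node] = label  # tentatively move the node
            h = structural_entropy_2d(edges, community)
            if h < best_entropy - 1e-12:
                best_label, best_entropy = label, h
        community[node] = best_label  # commit the best choice (or stay)
        moved = moved or best_label != current
    return moved


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
         (2, 3, 1.0)]
one_block = {n: 0 for n in range(6)}
two_triangles = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

print(structural_entropy_2d(edges, one_block))
print(structural_entropy_2d(edges, two_triangles))
```

On this toy graph the two-triangle partition has lower entropy than the single-block partition, which is the sense in which minimizing the quantity favors densely connected groups. A scalable version would update only the terms affected by a move rather than recomputing the whole sum.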