🤖 AI Summary
This work addresses structural distortion in spanning trees computed for unweighted networks. Prim's and Kruskal's algorithms are designed for weighted graphs, so their behavior on unweighted graphs is not obviously well founded, while depth-first search tends to produce deep, unbalanced trees. We evaluate these classical algorithms on unweighted networks and compare them with several priority-first search algorithms. A spanning tree built by breadth-first search (BFS) best preserves the distances between nodes and the network diameter, and it has the properties of a balanced tree with a power-law node degree distribution. Experiments on synthetic graphs and more than a thousand real networks show that BFS spanning trees preserve distances and diameter better than the compared algorithms. The result is a simple, efficient tool for network backbone extraction, simplification, sampling, and structural analysis.
📝 Abstract
A spanning tree of a network or graph is a subgraph that connects all the nodes with the minimum number of edges. A spanning tree retains the connectivity of a network and possibly other structural properties, and it is one of the simplest techniques for network simplification or sampling and for revealing a network's backbone or skeleton. Prim's algorithm and Kruskal's algorithm are well-known algorithms for computing a spanning tree of a weighted network. In this paper, we study the performance of these algorithms on unweighted networks and compare them to different priority-first search algorithms. We show that the distances between the nodes and the diameter of a network are best preserved by an algorithm based on breadth-first search node traversal. The algorithm computes a spanning tree with the properties of a balanced tree and a power-law node degree distribution. We support our results with experiments on synthetic graphs and more than a thousand real networks, and we demonstrate different practical applications of the computed spanning trees. We conclude that, if a spanning tree is supposed to retain the distances between the nodes or the diameter of an unweighted network, then the breadth-first search algorithm should be the preferred choice.
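To make the breadth-first construction concrete, the following is a minimal Python sketch of a BFS spanning tree on an unweighted graph. It is an illustration only, not the authors' implementation: the function names (`bfs_spanning_tree`, `distances_from`, `to_adjacency`), the adjacency-list representation, the example graph, and the choice of node 0 as the root are all assumptions made here, and the abstract does not say how the root is selected.

```python
from collections import deque


def bfs_spanning_tree(adj, root):
    """Return the tree edges found by a breadth-first search from root.

    adj maps each node to a list of its neighbors (undirected, unweighted).
    Only the connected component containing root is covered.
    """
    visited = {root}
    tree_edges = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in visited:
                # First time v is reached: keep the edge that discovered it.
                visited.add(v)
                tree_edges.append((u, v))
                queue.append(v)
    return tree_edges


def distances_from(adj, root):
    """Hop distances from root to every reachable node (plain BFS)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def to_adjacency(nodes, edges):
    """Build an undirected adjacency list from an edge list."""
    adj = {node: [] for node in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


# Hypothetical example: a cycle on six nodes.
graph = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0]}

tree_edges = bfs_spanning_tree(graph, root=0)
tree = to_adjacency(graph, tree_edges)

print(tree_edges)
# Distances from the root are the same in the tree and in the graph.
print(distances_from(tree, 0) == distances_from(graph, 0))
```

In this sketch the tree keeps every distance measured from the root exactly, because each node is attached through the edge that first reached it. Distances between other pairs can still grow: nodes 3 and 4 are adjacent in the cycle but five hops apart in the tree. This is why the abstract speaks of distances being "best preserved" rather than preserved exactly.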