🤖 AI Summary
This work addresses correlation clustering on dynamic graphs under a node-streaming insertion model, where the algorithm accesses graph data exclusively via database queries and must incrementally update the clustering structure upon each node arrival. To this end, we propose SPARSE-PIVOT—a theoretically grounded, approximate clustering framework built upon sparse pivot selection, a dynamic graph query mechanism, and an efficient incremental update strategy. SPARSE-PIVOT achieves an amortized update time of $O_varepsilon(log^{O(1)} n)$ and a provable approximation ratio of $20+varepsilon$, substantially improving upon prior methods. Its theoretical analysis is rigorous, and its implementation is highly efficient. Extensive experiments on real-world dynamic networks demonstrate that SPARSE-PIVOT attains both higher clustering accuracy and lower latency compared to state-of-the-art approaches, while scaling effectively to large-scale evolving graphs for real-time clustering maintenance.
📝 Abstract
We present a new Correlation Clustering algorithm for a dynamic setting where nodes are added one at a time. In this model, proposed by Cohen-Addad, Lattanzi, Maggiori, and Parotsidis (ICML 2024), the algorithm uses database queries to access the input graph and updates the clustering as each new node is added. Our algorithm has the amortized update time of $O_ε(log^{O(1)}(n))$. Its approximation factor is $20+varepsilon$, which is a substantial improvement over the approximation factor of the algorithm by Cohen-Addad et al. We complement our theoretical findings by empirically evaluating the approximation guarantee of our algorithm. The results show that it outperforms the algorithm by Cohen-Addad et al.~in practice.