CoreSPECT: Enhancing Clustering Algorithms via an Interplay of Density and Geometry

📅 2025-07-10
🤖 AI Summary
Most existing clustering algorithms rely on either density or geometric structure and neglect how the two interact. CoreSPECT (Core Space Projection-based Enhancement of Clustering Techniques) is a framework that identifies and formalizes this interplay between density and geometry. It applies a simple clustering algorithm such as K-Means or a Gaussian Mixture Model (GMM) to strategically selected regions of the data, then extends the resulting partial partition to the full dataset through multi-layer propagation on a neighborhood graph. The framework is supported by initial theoretical guarantees, ablation studies, and evidence of robustness to noise. Evaluated on 15 datasets from three domains, CoreSPECT improves the Adjusted Rand Index (ARI) of K-Means and GMM by 40% and 14% on average, respectively, often surpassing both manifold-based and recent density-based clustering methods.

📝 Abstract
Density and geometry have long served as two of the fundamental guiding principles in clustering algorithm design, with algorithms usually focusing on either the density structure of the data (e.g., HDBSCAN and Density Peak Clustering) or the complexity of the underlying geometry (e.g., manifold clustering algorithms). In this paper, we identify and formalize a recurring but often overlooked interaction between distribution and geometry, and leverage this insight to design our clustering enhancement framework CoreSPECT (Core Space Projection-based Enhancement of Clustering Techniques). Our framework boosts the performance of simple algorithms like K-Means and GMM by applying them to strategically selected regions, then extending the partial partition to a complete partition of the dataset using a novel neighborhood-graph-based multi-layer propagation procedure. We apply our framework to 15 datasets from three different domains and obtain consistent and substantial gains in clustering accuracy for both K-Means and GMM. On average, our framework improves the ARI of K-Means by 40% and of GMM by 14%, often surpassing the performance of both manifold-based and recent density-based clustering algorithms. We further support our framework with initial theoretical guarantees, ablation studies that demonstrate the usefulness of the individual steps, and evidence of robustness to noise.
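The results above are reported in terms of the Adjusted Rand Index (ARI). As a reminder of what that metric measures, here is a minimal pure-Python sketch of the standard pair-counting ARI formula. The function name `adjusted_rand_index` is illustrative and is not taken from the paper; returning 1.0 in the degenerate case is an assumed convention.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """Pair-counting ARI between two labelings of the same items."""
    n = len(truth)
    # Pairs of items that share both a true cluster and a predicted cluster.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    # Pairs that share a true cluster, and pairs that share a predicted cluster.
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    # Agreement expected by chance, and the maximum achievable agreement.
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case (e.g. both labelings are a single cluster);
        # assumed convention: treat as perfect agreement.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# ARI ignores label names: swapping 0 and 1 still counts as a perfect match.
same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A partition that cuts across the true clusters scores below zero.
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because ARI is corrected for chance, a random labeling scores near 0 and only an identical partition scores 1.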
Problem

Research questions and friction points this paper is trying to address.

Enhancing clustering by combining density and geometry principles
Improving K-Means and GMM via strategic region selection
Boosting accuracy with multi-layer propagation on neighborhood graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages density-geometry interaction for clustering
Uses novel multi-layer graph propagation
Enhances K-Means and GMM performance significantly
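The points above can be illustrated with a toy sketch of the general "cluster a core, then propagate" idea. This is not the authors' procedure: the density score (distance to the k-th nearest neighbor), the way the core and the layers are chosen, and the majority-vote propagation rule are all simplifying assumptions, and every name here (`knn_radius`, `kmeans`, `core_then_propagate`, `core_frac`, `n_layers`, `init_centers`) is hypothetical.

```python
import math
import random
from collections import Counter


def knn_radius(points, k):
    """Distance from each point to its k-th nearest neighbor (small = dense)."""
    radii = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)
        radii.append(dists[k])  # dists[0] is the point itself
    return radii


def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations starting from the given centers."""
    def nearest(p):
        return min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))

    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p)].append(p)
        # Recompute each center as the mean of its group; keep empty groups fixed.
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return [nearest(p) for p in points]


def core_then_propagate(points, init_centers, k=5, core_frac=0.4, n_layers=3):
    """Assumed toy pipeline: K-Means on dense points, then layered label voting."""
    radii = knn_radius(points, k)
    order = sorted(range(len(points)), key=lambda i: radii[i])  # densest first
    n_core = max(len(init_centers), int(core_frac * len(points)))
    core, rest = order[:n_core], order[n_core:]

    # Step 1: cluster only the densest points.
    core_labels = kmeans([points[i] for i in core], list(init_centers))
    labels = dict(zip(core, core_labels))

    # Step 2: label the remaining points layer by layer, from dense to sparse.
    size = max(1, math.ceil(len(rest) / n_layers))
    for start in range(0, len(rest), size):
        labeled = list(labels.items())  # only points labeled before this layer vote
        for i in rest[start:start + size]:
            voters = sorted(
                labeled, key=lambda item: math.dist(points[i], points[item[0]])
            )[:k]
            labels[i] = Counter(lab for _, lab in voters).most_common(1)[0][0]
    return [labels[i] for i in range(len(points))]


# Two well-separated synthetic blobs of 40 points each (illustrative data only).
rng = random.Random(0)
blob_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
blob_b = [(rng.gauss(10, 1), rng.gauss(0, 1)) for _ in range(40)]
labels = core_then_propagate(blob_a + blob_b, init_centers=[(0.0, 0.0), (10.0, 0.0)])
```

The intent of the layered vote is that sparse boundary points, which are the hardest for K-Means to place, receive their labels from neighbors that were labeled earlier rather than from centroid distance.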
Chandra Sekhar Mukherjee
University of Southern California
Theoretical computer science
Joonyoung Bae
Thomas Lord Department of Computer Science, University of Southern California
Jiapeng Zhang
Thomas Lord Department of Computer Science, University of Southern California