AI Summary
Traditional clustering methods (e.g., k-means, DBSCAN, HDBSCAN) suffer from degraded performance and weak outlier detection in high-dimensional data due to the curse of dimensionality. To address this, we propose DelTriC, a novel framework that decouples neighborhood construction from the clustering decision. Its core innovation lies in first reducing dimensionality via PCA or UMAP, then constructing a Delaunay triangulation-based neighborhood graph in the low-dimensional space; subsequently, a back-projection mechanism maps this graph back to the original high-dimensional space to enable robust edge pruning, connected-component merging, and outlier identification. This design mitigates the distortion of neighborhood relationships introduced by dimensionality reduction, improving both clustering accuracy and the robustness of outlier detection. Extensive experiments on multiple high-dimensional benchmark datasets demonstrate that DelTriC consistently outperforms state-of-the-art baselines, while maintaining scalability and practical applicability.
Abstract
The paper introduces DelTriC (Delaunay Triangulation Clustering), a clustering algorithm which integrates PCA/UMAP-based projection, Delaunay triangulation, and a novel back-projection mechanism to form clusters in the original high-dimensional space. DelTriC decouples neighborhood construction from decision-making by first triangulating in a low-dimensional proxy to index local adjacency, and then back-projecting to the original space to perform robust edge pruning, merging, and anomaly detection. DelTriC can outperform traditional methods such as k-means, DBSCAN, and HDBSCAN in many scenarios; it is both scalable and accurate, and it also significantly improves outlier detection.
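The pipeline described above (project, triangulate, back-project, prune, merge, flag outliers) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `deltric_sketch`, the median-based pruning rule, and the parameters `prune_factor` and `min_cluster_size` are assumptions chosen for illustration, since the abstract does not specify the pruning or outlier criteria. PCA is used for the projection; UMAP could be substituted.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import Delaunay
from sklearn.decomposition import PCA


def deltric_sketch(X, n_components=2, prune_factor=2.0, min_cluster_size=5):
    """Illustrative project/triangulate/back-project clustering (hypothetical)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # 1. Project to a low-dimensional proxy space.
    Z = PCA(n_components=n_components).fit_transform(X)

    # 2. Triangulate in the proxy space and collect the unique edges
    #    (every pair of vertices within each simplex).
    simplices = Delaunay(Z).simplices
    k = simplices.shape[1]
    pairs = [simplices[:, [i, j]] for i in range(k) for j in range(i + 1, k)]
    edges = np.unique(np.sort(np.vstack(pairs), axis=1), axis=0)

    # 3. Back-project: measure each edge in the ORIGINAL space.
    lengths = np.linalg.norm(X[edges[:, 0]] - X[edges[:, 1]], axis=1)

    # 4. Prune long edges. Assumed rule: drop edges longer than
    #    prune_factor times the median original-space edge length.
    kept = edges[lengths <= prune_factor * np.median(lengths)]

    # 5. Merge: clusters are connected components of the pruned graph.
    graph = coo_matrix(
        (np.ones(len(kept)), (kept[:, 0], kept[:, 1])), shape=(n, n)
    )
    _, labels = connected_components(graph, directed=False)

    # 6. Outliers. Assumed rule: components smaller than
    #    min_cluster_size are labeled -1.
    sizes = np.bincount(labels)
    return np.where(sizes[labels] < min_cluster_size, -1, labels)
```

The point of step 3 is that the triangulation only proposes candidate neighbors. Whether two points are actually joined is decided from their distance in the original space, so neighbors that appear close only because of the projection can still be separated.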