An effective variant of the Hartigan $k$-means algorithm

📅 2026-04-23

🤖 AI Summary

This work addresses the limited clustering quality of Lloyd's algorithm in many practical scenarios by proposing a lightweight variation of Hartigan's k-means algorithm. The method refines the point-wise reallocation strategy and local search mechanism to improve clustering quality without substantially increasing computational overhead. Compared to the original Hartigan algorithm, the proposed variant is reported to give a further 2%–5% improvement in clustering quality, and the gain tends to grow as either the data dimensionality or the number of clusters increases.

📝 Abstract

The k-means problem is perhaps the classical clustering problem and is often treated as synonymous with Lloyd's algorithm (1957). It has become clear that Hartigan's algorithm (1975) gives better results in almost all cases; Telgarsky-Vattani note a typical improvement of $5\%$ -- $10\%$. We point out that a very minor variation of Hartigan's method leads to another $2\%$ -- $5\%$ improvement; the improvement tends to become larger when either the dimension or $k$ increases.
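For orientation, the following is a minimal sketch of the classical Hartigan single-point move rule that the abstract takes as its baseline. It is not the paper's variation, which the abstract does not spell out. The function names (`sq_dist`, `kmeans_cost`, `hartigan_sweeps`) and the parameters `max_sweeps` and `tol` are illustrative assumptions. The rule used is the standard one: moving a point $x$ from cluster $A$ to cluster $B$ changes the k-means cost by $\frac{n_B}{n_B+1}\|x-c_B\|^2 - \frac{n_A}{n_A-1}\|x-c_A\|^2$, and the move is accepted only if this quantity is negative.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            center = [sum(col) / len(members) for col in zip(*members)]
            total += sum(sq_dist(p, center) for p in members)
    return total


def hartigan_sweeps(points, labels, k, max_sweeps=100, tol=1e-12):
    """Classical Hartigan-style local search (illustrative, not the paper's variant).

    Each point is moved to the cluster that lowers the total cost the most,
    with the size-dependent correction factors n/(n+1) and n/(n-1).
    """
    labels = list(labels)
    dim = len(points[0])
    sizes = [0] * k
    sums = [[0.0] * dim for _ in range(k)]
    for p, lab in zip(points, labels):
        sizes[lab] += 1
        for d in range(dim):
            sums[lab][d] += p[d]

    for _ in range(max_sweeps):
        moved = False
        for i, p in enumerate(points):
            a = labels[i]
            if sizes[a] <= 1:
                continue  # assumption: never empty a cluster
            center_a = [s / sizes[a] for s in sums[a]]
            # Cost saved by removing p from its current cluster.
            removal_gain = sizes[a] / (sizes[a] - 1) * sq_dist(p, center_a)
            best, best_delta = a, -tol
            for b in range(k):
                if b == a:
                    continue
                if sizes[b] == 0:
                    insertion_cost = 0.0  # a lone point sits on its centroid
                else:
                    center_b = [s / sizes[b] for s in sums[b]]
                    insertion_cost = sizes[b] / (sizes[b] + 1) * sq_dist(p, center_b)
                delta = insertion_cost - removal_gain
                if delta < best_delta:
                    best, best_delta = b, delta
            if best != a:
                # Update running sums and sizes so centroids stay current.
                for d in range(dim):
                    sums[a][d] -= p[d]
                    sums[best][d] += p[d]
                sizes[a] -= 1
                sizes[best] += 1
                labels[i] = best
                moved = True
        if not moved:
            break
    return labels
```

Unlike Lloyd's algorithm, which reassigns all points to their nearest centroid and only then recomputes the centroids, this procedure updates the affected centroids after every single move, so each accepted move strictly lowers the cost. For example, `hartigan_sweeps([[0.0], [1.0], [10.0], [11.0]], [0, 1, 0, 1], 2)` starts from a deliberately poor labelling and separates the two obvious groups.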

Problem

Research questions and friction points this paper is trying to address.

k-means

clustering

Hartigan algorithm

Lloyd's algorithm

optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hartigan k-means

clustering algorithm

algorithmic improvement