An effective variant of the Hartigan $k$-means algorithm

📅 2026-04-23

🤖 AI Summary
This work addresses the limited clustering quality of Lloyd's algorithm in many practical scenarios by proposing a lightweight enhancement of Hartigan's k-means algorithm. The method refines the point-wise reallocation strategy and local search mechanism to further improve clustering performance without substantially increasing computational overhead. Experimental results indicate that, compared to the original Hartigan algorithm, the proposed approach achieves a further 2%–5% improvement in clustering quality, and that this gain tends to grow as either the data dimensionality or the number of clusters increases.

📝 Abstract
The k-means problem is perhaps the classical clustering problem and is often treated as synonymous with Lloyd's algorithm (1957). It has become clear that Hartigan's algorithm (1975) gives better results in almost all cases; Telgarsky-Vattani note a typical improvement of $5\%$ -- $10\%$. We point out that a very minor variation of Hartigan's method leads to another $2\%$ -- $5\%$ improvement; the improvement tends to become larger when either the dimension or $k$ increases.
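
For orientation, here is a minimal sketch of the classical Hartigan single-point reallocation step that the abstract builds on. It is not the paper's variant, which the abstract does not specify. It relies on the standard identity that moving a point $x$ from cluster $A$ (size $n_A$, centroid $c_A$) to cluster $B$ (size $n_B$, centroid $c_B$) changes the k-means cost by $\frac{n_B}{n_B+1}\|x-c_B\|^2 - \frac{n_A}{n_A-1}\|x-c_A\|^2$. The function names (`hartigan`, `centroids_of`, `cost`, `sq_dist`) and the `max_sweeps` parameter are illustrative choices, not taken from the paper.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroids_of(points, labels, k):
    """Return (centroids, sizes) for the given assignment."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    sizes = [0] * k
    for p, j in zip(points, labels):
        sizes[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    cents = [[s / n for s in row] if n else row for row, n in zip(sums, sizes)]
    return cents, sizes


def cost(points, labels, k):
    """k-means objective: sum of squared distances to the assigned centroid."""
    cents, _ = centroids_of(points, labels, k)
    return sum(sq_dist(p, cents[j]) for p, j in zip(points, labels))


def hartigan(points, labels, k, max_sweeps=100):
    """Classical Hartigan local search (illustrative sketch, not the paper's variant).

    Visits points one at a time and moves a point to another cluster
    whenever that strictly lowers the k-means cost.
    """
    labels = list(labels)
    cents, sizes = centroids_of(points, labels, k)
    for _ in range(max_sweeps):
        moved = False
        for i, x in enumerate(points):
            a = labels[i]
            if sizes[a] <= 1:
                continue  # assumption of this sketch: never empty a cluster
            # Cost saved by removing x from its current cluster a.
            removal_gain = sizes[a] / (sizes[a] - 1) * sq_dist(x, cents[a])
            best_b, best_delta = a, 0.0
            for b in range(k):
                if b == a:
                    continue
                # Net cost change of moving x from a to b.
                delta = sizes[b] / (sizes[b] + 1) * sq_dist(x, cents[b]) - removal_gain
                if delta < best_delta - 1e-12:
                    best_b, best_delta = b, delta
            if best_b != a:
                na, nb = sizes[a], sizes[best_b]
                # Update both centroids incrementally after the move.
                cents[a] = [(na * c - xi) / (na - 1) for c, xi in zip(cents[a], x)]
                cents[best_b] = [(nb * c + xi) / (nb + 1) for c, xi in zip(cents[best_b], x)]
                sizes[a] -= 1
                sizes[best_b] += 1
                labels[i] = best_b
                moved = True
        if not moved:
            break  # local optimum: no single-point move lowers the cost
    return labels
```

Unlike Lloyd's algorithm, which reassigns every point to its nearest centroid and only then recomputes the centroids, this step accounts for how each individual move shifts both affected centroids. As a result, it can accept moves that Lloyd's nearest-centroid rule would not make. A call such as `hartigan(points, initial_labels, k)` returns the refined labels, and `cost` can be used to compare the objective before and after.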
Problem

Research questions and friction points this paper is trying to address.

k-means
clustering
Hartigan algorithm
Lloyd's algorithm
optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hartigan k-means
clustering algorithm
algorithmic improvement
k-means optimization
unsupervised learning