PE-means: Improved Differentially Private $k$-means Clustering through Private Evolution

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses differentially private (DP) $k$-means clustering in Euclidean space, where summing the private data directly induces a sensitivity proportional to the size of the data domain. To avoid this, the paper adapts the Private Evolution (PE) framework, originally a method for synthetic data generation, to $k$-means clustering. PE computes only a private histogram with constant sensitivity to guide the evolution, and the adaptation adds evolutionary operators designed specifically for clustering, so the high-sensitivity aggregation step is never needed. Under differential privacy guarantees, the resulting method, PE-means, achieves an average 20% improvement in clustering loss over state-of-the-art baselines.
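
The sensitivity contrast can be made concrete with a short worked comparison. This is a sketch under assumptions that are not taken from the paper: the data lie in a ball of radius $R$, neighboring datasets $X \sim X'$ differ by adding or removing one point, and $h(X)$ is a histogram in which every point contributes a single vote to exactly one bin.

$$
\Delta_2(\mathrm{sum}) = \max_{X \sim X'} \Big\| \sum_{x \in X} x - \sum_{x \in X'} x \Big\|_2 = \max_{\|x\|_2 \le R} \|x\|_2 = R,
\qquad
\Delta_1(h) = \max_{X \sim X'} \| h(X) - h(X') \|_1 = 1 .
$$

Under these assumptions, the noise needed to privatize the sum grows with $R$, while the noise needed for the histogram does not depend on the domain at all.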

📝 Abstract

We study the problem of differentially private (DP) $k$-means clustering in Euclidean space. Previous solutions rely on summing the private data directly, which induces a sensitivity proportional to the domain. We introduce PE-means, an extension of the private evolution (PE) algorithm (an increasingly popular method for synthetic data generation), to the problem of $k$-means clustering. The key advantage of PE is that it only computes a private histogram with constant sensitivity to guide the evolution. Our adaptation of PE includes new evolutionary operators for clustering, as well as other algorithmic improvements of independent interest. Overall, PE-means achieves an average improvement of 20% in clustering loss over state-of-the-art baselines.
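
To illustrate what a constant-sensitivity histogram can look like, here is a minimal Python sketch. It is not the PE-means algorithm: the function names, the nearest-candidate voting rule, and the use of Laplace noise are all assumptions made for illustration. Each private point votes for its nearest candidate point, so adding or removing one private point changes exactly one count by 1, whatever the size of the domain.

```python
import numpy as np


def nearest_candidate_histogram(private_points, candidates):
    """Count how many private points have each candidate as nearest neighbor.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Pairwise squared distances, shape (n_points, n_candidates).
    diffs = private_points[:, None, :] - candidates[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    # Each private point casts exactly one vote.
    votes = sq_dists.argmin(axis=1)
    return np.bincount(votes, minlength=len(candidates)).astype(float)


def noisy_histogram(private_points, candidates, epsilon, rng):
    """Release the vote histogram with Laplace noise (assumed mechanism).

    Under add/remove neighboring datasets the L1 sensitivity is 1,
    so Laplace noise with scale 1/epsilon gives epsilon-DP.
    """
    hist = nearest_candidate_histogram(private_points, candidates)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon, size=hist.shape)
    return hist + noise


rng = np.random.default_rng(0)
private_points = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
candidates = np.array([[0.0, 0.1], [5.0, 4.9], [10.0, 10.0]])

exact = nearest_candidate_histogram(private_points, candidates)
noisy = noisy_histogram(private_points, candidates, epsilon=1.0, rng=rng)
```

In this toy example the exact vote counts are 2, 1 and 0. The noise scale depends only on `epsilon`, never on how far apart the points are. Privatizing a sum of the points directly would instead require noise scaled to the domain radius.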

Problem

Research questions and friction points this paper is trying to address.

differentially private

k-means clustering

private evolution

sensitivity

Euclidean space

Innovation

Methods, ideas, or system contributions that make the work stand out.

differentially private clustering

private evolution

k-means