🤖 AI Summary
This paper targets three challenges in privacy-preserving k-means clustering for horizontal federated learning: high computational overhead, weak privacy guarantees (in particular, no output privacy), and significant utility degradation. It proposes a lightweight, secure-aggregation-based protocol built on the computational differential privacy (computational DP) model, combining secure computation with a new DP clustering mechanism so that parties can cluster collaboratively under strong privacy protection. Compared with state-of-the-art related work, the protocol is about five orders of magnitude (10⁵×) faster, and it matches or exceeds the clustering utility of state-of-the-art DP baselines in the central model. It also supports fine-grained privacy–accuracy trade-offs. The core idea is to bring the computational DP model to federated clustering, which avoids the utility loss incurred by adding noise in the local DP model. The resulting design is faster, more private, and more accurate than previous work.
📝 Abstract
We study the problem of privacy-preserving $k$-means clustering in the horizontally federated setting. Existing federated approaches using secure computation suffer from substantial overheads and do not offer output privacy. At the same time, differentially private (DP) $k$-means algorithms either assume a trusted central curator or significantly degrade utility by adding noise in the local DP model. Naively combining the secure and central DP solutions results in a protocol with impractical overhead. Instead, our work provides enhancements to both the DP and secure computation components, resulting in a design that is faster, more private, and more accurate than previous work. By utilizing the computational DP model, we design a lightweight, secure aggregation-based approach that achieves five orders of magnitude speed-up over state-of-the-art related work. Furthermore, we not only maintain the utility of the state-of-the-art in the central model of DP, but we improve the utility further by designing a new DP clustering mechanism.
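To make the idea of secure-aggregation-based DP $k$-means concrete, the following is a minimal sketch of a single Lloyd round, not the paper's actual protocol. It assumes pairwise additive masks that cancel in the aggregate, fixed-point encoding, Laplace noise added once to the aggregated sums and counts, and points bounded in $[-\text{bound}, \text{bound}]^d$. All names (`MODULUS`, `SCALE`, `dp_kmeans_round`, and so on), the even split of the per-round budget, and the noise calibration are illustrative assumptions.

```python
import random

MODULUS = 2**61 - 1   # assumed modulus for masked arithmetic
SCALE = 10**6         # assumed fixed-point scale

def encode(x):
    """Map a real value to a fixed-point residue mod MODULUS."""
    return int(round(x * SCALE)) % MODULUS

def decode(v):
    """Map a residue back to a signed real value."""
    if v > MODULUS // 2:
        v -= MODULUS
    return v / SCALE

def local_stats(points, centroids):
    """Per-cluster coordinate sums and counts for one client's points."""
    k, d = len(centroids), len(centroids[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for p in points:
        j = min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
        counts[j] += 1
        for t in range(d):
            sums[j][t] += p[t]
    return sums, counts

def flatten(sums, counts):
    """Encode sums (row by row) followed by counts as one integer vector."""
    return [encode(v) for row in sums for v in row] + [encode(c) for c in counts]

def pairwise_masks(n_clients, length, rng):
    """Masks that sum to zero mod MODULUS. A real protocol would derive each
    pair's mask from a shared seed; one rng simulates that here (assumption)."""
    masks = [[0] * length for _ in range(n_clients)]
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            for t in range(length):
                r = rng.randrange(MODULUS)
                masks[i][t] = (masks[i][t] + r) % MODULUS
                masks[j][t] = (masks[j][t] - r) % MODULUS
    return masks

def secure_aggregate(vectors, rng):
    """The server only ever sums masked vectors; the masks cancel in the total."""
    masks = pairwise_masks(len(vectors), len(vectors[0]), rng)
    masked = [[(v + m) % MODULUS for v, m in zip(vec, mask)]
              for vec, mask in zip(vectors, masks)]
    total = [sum(col) % MODULUS for col in zip(*masked)]
    return [decode(v) for v in total]

def laplace(scale, rng):
    """Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def dp_kmeans_round(client_points, centroids, epsilon, bound, rng):
    """One noisy Lloyd update; epsilon is this round's budget (assumed split
    evenly between sums and counts)."""
    k, d = len(centroids), len(centroids[0])
    vectors = [flatten(*local_stats(pts, centroids)) for pts in client_points]
    agg = secure_aggregate(vectors, rng)
    sum_scale = d * bound / (epsilon / 2)   # L1 sensitivity of sums: d * bound
    count_scale = 1 / (epsilon / 2)         # sensitivity of counts: 1
    new_centroids = []
    for j in range(k):
        noisy_count = agg[k * d + j] + laplace(count_scale, rng)
        if noisy_count < 1:                 # too few points: keep old centroid
            new_centroids.append(tuple(centroids[j]))
            continue
        new_centroids.append(tuple(
            (agg[j * d + t] + laplace(sum_scale, rng)) / noisy_count
            for t in range(d)))
    return new_centroids

if __name__ == "__main__":
    rng = random.Random(0)
    clients = [[(0.1, 0.2), (0.9, 0.8)], [(0.2, 0.1)], [(0.8, 0.9), (1.0, 1.0)]]
    print(dp_kmeans_round(clients, [(0.0, 0.0), (1.0, 1.0)], 1.0, 1.0, rng))
```

In this sketch the noise is added once to the aggregate, as in the central DP model, rather than by every client as in the local model. This is the intuition behind the utility gain the abstract describes. Who samples the noise, and how the budget is allocated across rounds, is left open here.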