FastLloyd: Federated, Accurate, Secure, and Tunable k-Means Clustering with Differential Privacy

📅 2024-05-03
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address three key challenges in privacy-preserving k-means clustering for horizontal federated learning (high computational overhead, weak privacy guarantees, and significant utility degradation), this paper proposes the first lightweight secure-aggregation framework grounded in computational differential privacy (computational DP). The method integrates secure multi-party computation with a novel DP-aware clustering mechanism to enable efficient, collaborative clustering under strong privacy protection. Compared to state-of-the-art approaches, it achieves a 10⁵× speedup in training time while matching or even surpassing the clustering utility of central-DP baselines, and it supports fine-grained privacy–accuracy trade-offs. The core innovation is introducing computational DP into federated clustering, overcoming the utility limitations inherent in conventional randomization-based mechanisms and establishing a new paradigm for high-privacy, high-utility federated unsupervised learning.

📝 Abstract
We study the problem of privacy-preserving $k$-means clustering in the horizontally federated setting. Existing federated approaches using secure computation suffer from substantial overheads and do not offer output privacy. At the same time, differentially private (DP) $k$-means algorithms either assume a trusted central curator or significantly degrade utility by adding noise in the local DP model. Naively combining the secure and central DP solutions results in a protocol with impractical overhead. Instead, our work provides enhancements to both the DP and secure computation components, resulting in a design that is faster, more private, and more accurate than previous work. By utilizing the computational DP model, we design a lightweight, secure aggregation-based approach that achieves five orders of magnitude speed-up over state-of-the-art related work. Furthermore, we not only maintain the utility of the state-of-the-art in the central model of DP, but we improve the utility further by designing a new DP clustering mechanism.
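The aggregation pattern the abstract describes (clients reveal only noisy per-cluster sums and counts, from which the server updates centroids) can be sketched as follows. This is a minimal illustrative simulation, not FastLloyd's actual protocol: the function names, the single Gaussian noise draw standing in for the DP noise inside the secure aggregate, and all parameter values are assumptions for the sketch.

```python
import numpy as np

def local_stats(points, centroids):
    """Client side: assign local points to the nearest centroid and
    return per-cluster coordinate sums and counts (never raw points)."""
    k, d = centroids.shape
    sums = np.zeros((k, d))
    counts = np.zeros(k)
    # nearest-centroid assignment via squared Euclidean distance
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    for j in range(k):
        mask = labels == j
        sums[j] = points[mask].sum(axis=0)
        counts[j] = mask.sum()
    return sums, counts

def aggregate_and_update(stats, centroids, sigma, rng):
    """Server-side view after secure aggregation: only the noisy global
    totals are revealed. A single Gaussian draw models DP noise added
    once inside the aggregate (an assumption of this sketch)."""
    k, d = centroids.shape
    tot_sums = sum(s for s, _ in stats) + rng.normal(0.0, sigma, (k, d))
    tot_counts = sum(c for _, c in stats) + rng.normal(0.0, sigma, k)
    new = centroids.copy()
    for j in range(k):
        if tot_counts[j] > 1:  # keep old centroid for (near-)empty clusters
            new[j] = tot_sums[j] / tot_counts[j]
    return new

# Toy horizontal federation: three clients, each holding one blob of points.
rng = np.random.default_rng(0)
clients = [rng.normal(loc, 0.1, (50, 2)) for loc in ([0, 0], [3, 3], [0, 3])]
centroids = rng.normal(1.5, 1.0, (3, 2))
for _ in range(5):  # a few Lloyd rounds
    stats = [local_stats(p, centroids) for p in clients]
    centroids = aggregate_and_update(stats, centroids, sigma=0.5, rng=rng)
print(centroids.round(1))
```

The design point the sketch illustrates is that each round's communication is a fixed-size vector of sums and counts, which is what makes a lightweight secure-aggregation protocol (rather than general secure computation over raw data) sufficient.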
Problem

Research questions and friction points this paper is trying to address.

Privacy-preserving federated k-means clustering with low overhead
Improving accuracy and speed in differentially private k-means clustering
Enhancing secure computation for federated clustering without trusted curator
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight secure aggregation for speed-up
Computational DP model enhances privacy
New DP clustering mechanism improves utility