🤖 AI Summary
To address three key challenges in unsupervised federated learning (UFL), namely non-i.i.d. data distribution, high communication and computation overhead, and sensitivity to channel noise, this paper proposes FedUHD, the first lightweight UFL framework built on hyperdimensional computing (HDC). Methodologically, clients employ a kNN-based cluster hypervector removal mechanism to mitigate non-i.i.d. bias; the server adopts a weighted HDC aggregation strategy to improve robustness and convergence; and the entire framework eschews deep neural networks, relying solely on hypervector operations for both training and inference. Experimental results demonstrate that FedUHD achieves an average 15.50% accuracy improvement over state-of-the-art methods, accelerates training by up to 173.6×, improves energy efficiency by up to 612.7×, reduces communication cost by up to 271×, and exhibits significantly greater robustness against diverse types of channel noise.
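The summary describes the client-side step only at a high level. Below is a minimal sketch of one plausible reading of kNN-based cluster hypervector removal: score each cluster hypervector by its mean cosine similarity to its k most similar peers, and drop low-scoring outliers. The function name, the threshold rule, and the use of cosine similarity are illustrative assumptions, not the paper's exact algorithm.

```python
def cosine(a, b):
    """Cosine similarity between two hypervectors given as number lists."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def knn_prune(cluster_hvs, k=3, threshold=0.1):
    """Hypothetical sketch: keep a cluster hypervector only if its mean
    cosine similarity to its k nearest neighbours reaches `threshold`.
    Outliers induced by non-i.i.d. local data fall below it and are removed."""
    n = len(cluster_hvs)
    if n <= 1:
        return list(cluster_hvs)
    k = min(k, n - 1)
    kept = []
    for i, hv in enumerate(cluster_hvs):
        # Similarities to every other hypervector, largest first.
        sims = sorted(
            (cosine(hv, other) for j, other in enumerate(cluster_hvs) if j != i),
            reverse=True,
        )[:k]
        if sum(sims) / k >= threshold:
            kept.append(hv)
    return kept
```

With four identical bipolar hypervectors and one orthogonal outlier, only the outlier falls below the threshold and is pruned.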
📄 Abstract
Unsupervised federated learning (UFL) has gained attention as a privacy-preserving, decentralized machine learning approach that eliminates the need for labor-intensive data labeling. However, UFL faces several challenges in practical applications: (1) non-independent and identically distributed (non-iid) data distribution across devices, (2) expensive computational and communication costs at the edge, and (3) vulnerability to communication noise. Previous UFL approaches have relied on deep neural networks (NNs), which introduce substantial overhead in both computation and communication. In this paper, we propose FedUHD, the first UFL framework based on Hyperdimensional Computing (HDC). HDC is a brain-inspired computing scheme with lightweight training and inference operations, a much smaller model size, and robustness to communication noise. FedUHD introduces two novel HDC-based designs to improve UFL performance. On the client side, a kNN-based cluster hypervector removal method addresses non-iid data samples by eliminating detrimental outliers. On the server side, a weighted HDC aggregation technique balances the non-iid data distribution across clients. Our experiments demonstrate that FedUHD achieves up to 173.6× speedup and 612.7× better energy efficiency in training, up to 271× lower communication cost, and 15.50% higher accuracy on average across diverse settings, along with superior robustness to various types of noise, compared to state-of-the-art NN-based UFL approaches.
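The abstract states that the server balances non-iid client distributions via weighted HDC aggregation but does not give the exact rule. A minimal sketch, assuming each client uploads one bundled hypervector that the server averages with weights proportional to local sample counts (the function name and weighting scheme are assumptions for illustration):

```python
def weighted_aggregate(client_hvs, client_counts):
    """Hypothetical server-side aggregation sketch: bundle client
    hypervectors into a global one, weighting each client by its local
    sample count so data-rich clients dominate less unevenly distributed
    updates. Not necessarily the paper's exact weighting scheme."""
    total = sum(client_counts)
    dim = len(client_hvs[0])
    global_hv = [0.0] * dim
    for hv, count in zip(client_hvs, client_counts):
        w = count / total  # per-client weight from its share of samples
        for d in range(dim):
            global_hv[d] += w * hv[d]
    return global_hv
```

Because aggregation is a plain element-wise weighted sum over hypervectors, it costs only O(clients × dimension) arithmetic and needs no gradient exchange, which is consistent with the communication and computation savings the abstract reports.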