AI Summary
In logit-based federated learning, a semi-honest server can exploit shared logits to steal clients' private models, which poses a serious privacy threat. This work is the first to quantify this model leakage risk both theoretically and empirically, and it shows that effective attacks remain feasible even when the public data is unrelated to the private data. To measure the threat, we propose AdaMSA, an adaptive model stealing attack that leverages historical logits. To counter it, we introduce a targeted logit perturbation defense that mitigates privacy leakage while preserving training performance. The approach balances privacy preservation against model utility and fills a gap in understanding and mitigating model-level privacy risks in logit-based federated learning.
Abstract
Federated learning (FL) aims to protect data privacy by collaboratively learning a model without sharing private data among clients. Unlike traditional parameter-based FL methods that exchange model weights or gradients during training, emerging logit-based FL approaches share model outputs (logits) on public data. This strategy promotes model heterogeneity, reduces communication overhead, and enhances clients' privacy. However, the potential privacy risks associated with these logit-based methods have been largely overlooked. This research presents the first theoretical and empirical analysis of a hidden privacy risk in logit-based FL methods: a semi-honest server (the adversary) may learn clients' private models from the shared logits. To quantify and address this threat, we develop the Adaptive Model Stealing Attack (AdaMSA), which leverages historical logits collected during training. Notably, we observe that this inherent privacy risk persists even when the public data is unrelated to the private data, which underscores the urgency of addressing privacy vulnerabilities in logit-based FL methods. Moreover, our theoretical analysis establishes bounds on this privacy risk. We then propose a simple but effective defense strategy that perturbs the transmitted logits in the direction that minimizes the privacy risk while maximally preserving training performance. The experimental results validate our analysis and demonstrate the effectiveness of both AdaMSA and our defense strategy.
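The leakage described in the abstract can be illustrated with a toy sketch. It is not AdaMSA and not the paper's defense: it assumes a hypothetical linear client model, a server that fits a surrogate to the shared logits by least squares, and plain Gaussian noise as a stand-in for the targeted perturbation. All names (`steal_linear_model`, `agreement`, `w_client`) are illustrative and do not come from the paper.

```python
import numpy as np


def steal_linear_model(public_x, shared_logits):
    """Fit a surrogate linear model to logits observed on public data."""
    # Least-squares fit: find W such that public_x @ W approximates the logits.
    w_hat, *_ = np.linalg.lstsq(public_x, shared_logits, rcond=None)
    return w_hat


def agreement(w_a, w_b, x):
    """Fraction of inputs on which two linear models predict the same class."""
    pred_a = (x @ w_a).argmax(axis=1)
    pred_b = (x @ w_b).argmax(axis=1)
    return float(np.mean(pred_a == pred_b))


rng = np.random.default_rng(0)
dim, n_classes = 20, 5

# Assumption: the client's private model is a single linear layer.
w_client = rng.normal(size=(dim, n_classes))

# Public inputs are random and carry no information about any private data.
public_x = rng.normal(size=(200, dim))
test_x = rng.normal(size=(1000, dim))

# Logits the client would share with the server on the public data.
clean_logits = public_x @ w_client

# Stand-in defense (assumption): isotropic Gaussian noise on the logits.
noisy_logits = clean_logits + rng.normal(scale=5.0, size=clean_logits.shape)

w_stolen_clean = steal_linear_model(public_x, clean_logits)
w_stolen_noisy = steal_linear_model(public_x, noisy_logits)

fidelity_clean = agreement(w_client, w_stolen_clean, test_x)
fidelity_noisy = agreement(w_client, w_stolen_noisy, test_x)

print(f"fidelity from clean logits: {fidelity_clean:.3f}")
print(f"fidelity from noisy logits: {fidelity_noisy:.3f}")
```

In this toy setting the surrogate fitted to unperturbed logits matches the client model, even though the public inputs are random, while noise on the logits lowers the surrogate's agreement with the client. Untargeted noise of this kind would also degrade the logits as a training signal, which is the trade-off the targeted perturbation in the abstract is meant to avoid.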