🤖 AI Summary
In federated learning (FL), mainstream privacy techniques—differential privacy, homomorphic encryption, and secure multi-party computation—suffer from high communication/computation overhead and security limitations under the semi-honest threat model, particularly when handling nonlinear functions and large-scale matrix multiplication.
Method: We propose an approximate coded-computation framework based on Berrut rational interpolation, the first to integrate this interpolation scheme into a Shamir-type secret sharing mechanism. This enables privacy-preserving nonlinear operations and distributed matrix multiplication without exposing local data.
Contribution/Results: Our approach provides strong input privacy, low communication complexity, model-agnosticism, and compatibility with heterogeneous devices. We formally prove its privacy guarantees and empirically demonstrate controllable accuracy and significantly lower overhead compared to state-of-the-art methods. In large-scale FL settings, it achieves a superior privacy–accuracy trade-off.
📝 Abstract
Federated Learning (FL) is a collaborative strategy that enables different data owners to jointly train an AI model without revealing their private datasets. Even so, FL has privacy vulnerabilities that are commonly addressed with techniques such as Differential Privacy (DP), Homomorphic Encryption (HE), or Secure Multi-Party Computation (SMPC). However, these techniques have important drawbacks that narrow their range of application: difficulty handling non-linear functions, difficulty performing large matrix multiplications, and high communication and computational costs when managing semi-honest nodes. In this context, we propose a solution that guarantees privacy in FL schemes while simultaneously addressing all of these problems. Our proposal is based on Berrut Approximated Coded Computing, a technique from the Coded Distributed Computing paradigm, adapted to a Secret Sharing configuration to provide input privacy to FL in a scalable way. It supports the computation of non-linear functions and treats the special case of distributed matrix multiplication, a key primitive at the core of many automated learning tasks. Because it is independent of the machine learning models and aggregation algorithms used, it can be applied in a wide range of FL scenarios. We analyse the privacy guarantees and complexity of our solution, and extensive numerical results show a good trade-off between privacy and precision.
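To make the core mechanism concrete, the sketch below illustrates the Berrut rational interpolation that Berrut Approximated Coded Computing builds on: secrets are encoded as evaluations of a pole-free rational interpolant at worker points, each worker applies a nonlinear function to its coded share, and the master decodes by interpolating the worker outputs back at the original nodes. This is a minimal illustration under assumed parameters (Chebyshev node placement, `tanh` as the nonlinear function, no noise terms for privacy), not the paper's exact protocol.

```python
import numpy as np

def berrut_interpolate(nodes, values, x):
    """Evaluate Berrut's rational interpolant at x:
    r(x) = sum_k w_k*f_k/(x - x_k) / sum_k w_k/(x - x_k), weights w_k = (-1)^k.
    This interpolant has no poles on the real line."""
    w = (-1.0) ** np.arange(len(nodes))
    diff = x - nodes
    hit = np.isclose(diff, 0.0)
    if hit.any():                      # exact node hit: return stored value
        return values[np.argmax(hit)]
    terms = w / diff
    return np.dot(terms, values) / terms.sum()

# --- Illustrative coded-computation round (assumed setup) ---
K, N = 4, 50                                   # K secrets, N workers
rng = np.random.default_rng(0)
secrets = rng.standard_normal(K)               # private inputs
alpha = np.cos(np.arange(K) * np.pi / (K - 1)) # encoding nodes (Chebyshev, 2nd kind)
z = np.cos((2 * np.arange(N) + 1) * np.pi / (2 * N))  # worker points (1st kind)

# Encoding: worker i receives the interpolant of the secrets evaluated at z_i.
shares = np.array([berrut_interpolate(alpha, secrets, zi) for zi in z])

# Computation: each worker applies the nonlinear function to its coded share.
f = np.tanh
worker_results = f(shares)

# Decoding: interpolate worker outputs back at the encoding nodes to
# approximately recover f(secret_k) without any worker seeing the secrets.
approx = np.array([berrut_interpolate(z, worker_results, ak) for ak in alpha])
exact = f(secrets)
print(np.max(np.abs(approx - exact)))  # small, data-dependent approximation error
```

The approximation error shrinks as the number of workers N grows, which is the "controllable accuracy" trade-off the summary refers to: more workers buy better precision at higher communication cost.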