AI Summary
This work addresses the lack of theoretical robustness guarantees for the MultiKrum aggregation rule under the Byzantine threat model. It presents the first proof that MultiKrum is a robust aggregation rule and introduces the optimal robustness coefficient κ*, which characterizes an aggregation rule's mean-estimation error in the presence of adversaries more tightly than earlier notions of robustness. Working from the viewpoint of robust mean estimation, the authors derive an upper and a lower bound on MultiKrum's robustness coefficient and, as a by-product, improve the best-known bounds on Krum's. MultiKrum's bounds are never worse than Krum's and are better in realistic regimes. An experimental investigation examines the quality of the lower bound.
Abstract
Aggregation rules are the cornerstone of distributed (or federated) learning in the presence of adversaries, under the so-called Byzantine threat model. They are also interesting mathematical objects from the point of view of robust mean estimation. The Krum aggregation rule has been extensively studied, and endowed with formal robustness and convergence guarantees. Yet, MultiKrum, a natural extension of Krum, is often preferred in practice for its superior empirical performance, even though no theoretical guarantees were available until now. In this work, we provide the first proof that MultiKrum is a robust aggregation rule, and bound its robustness coefficient. To do so, we introduce $\kappa^\star$, the optimal *robustness coefficient* of an aggregation rule, which quantifies the accuracy of mean estimation in the presence of adversaries in a tighter manner compared with previously adopted notions of robustness. We then construct an upper and a lower bound on MultiKrum's robustness coefficient. As a by-product, we also improve on the best-known bounds on Krum's robustness coefficient. We show that MultiKrum's bounds are never worse than Krum's, and better in realistic regimes. We illustrate this analysis by an experimental investigation on the quality of the lower bound.
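For readers unfamiliar with the two rules, the following is a minimal NumPy sketch of Krum and MultiKrum as they are commonly described in the literature; it is not taken from this paper. It assumes `n` input vectors of which at most `f` are Byzantine. Each vector is scored by the sum of squared distances to its `n - f - 2` nearest neighbors. Krum returns the lowest-scoring vector, and MultiKrum averages the `m` lowest-scoring vectors. The function names, the parameter `m`, and the toy data are illustrative assumptions, and the exact variant analyzed by the authors may differ.

```python
import numpy as np


def krum_scores(vectors: np.ndarray, f: int) -> np.ndarray:
    """Score each row by the sum of squared distances to its n - f - 2 nearest neighbors."""
    n = vectors.shape[0]
    k = n - f - 2  # neighborhood size (assumed convention, see lead-in)
    if k < 1:
        raise ValueError("need n >= f + 3")
    diffs = vectors[:, None, :] - vectors[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    np.fill_diagonal(sq_dists, np.inf)  # a vector is not its own neighbor
    nearest = np.sort(sq_dists, axis=1)[:, :k]
    return nearest.sum(axis=1)


def krum(vectors: np.ndarray, f: int) -> np.ndarray:
    """Return the single input vector with the lowest score."""
    return vectors[np.argmin(krum_scores(vectors, f))]


def multi_krum(vectors: np.ndarray, f: int, m: int) -> np.ndarray:
    """Average the m input vectors with the lowest scores (m is a hypothetical parameter name)."""
    scores = krum_scores(vectors, f)
    selected = np.argsort(scores, kind="stable")[:m]
    return vectors[selected].mean(axis=0)


# Toy data (illustrative only): five honest vectors and two identical outliers.
honest = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
byzantine = np.array([[100.0, 100.0], [100.0, 100.0]])
vectors = np.vstack([honest, byzantine])

krum_out = krum(vectors, f=2)
multi_krum_out = multi_krum(vectors, f=2, m=5)
print(krum_out, multi_krum_out)
```

On this toy input both rules ignore the two outliers. Krum returns one honest vector, while MultiKrum with `m = 5` returns the average of all five honest vectors. Averaging several selected vectors in this way is one plausible reason for the empirical advantage the abstract mentions, although the sketch itself demonstrates nothing about the bounds on $\kappa^\star$.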