H+: An Efficient Similarity-Aware Aggregation for Byzantine Resilient Federated Learning

📅 2025-09-29
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
In federated learning, Byzantine attacks undermine server-side robust aggregation of local model updates; existing similarity-based defenses rely on clean reference data and fail when no trusted samples are available. To address this, we propose H+, the first similarity-aware aggregation method designed for the clean-data-free setting. H+ introduces a piecewise similarity detection function H, incorporating randomized low-dimensional parameter fragment comparisons (K comparisons of r-dimensional subvectors) and a replaceable reference vector generation mechanism, enabling efficient malicious client identification in 𝒪(KMr) time complexity. Experiments demonstrate that H+ achieves state-of-the-art performance across diverse Byzantine attack types—including sign-flipping, Gaussian, and adaptive attacks—and under high attack proportions (up to 50%). It delivers strong robustness, low computational overhead, and broad applicability across heterogeneous client settings and model architectures.

📝 Abstract
Federated Learning (FL) enables decentralized model training without sharing raw data. However, it remains vulnerable to Byzantine attacks, which can compromise the aggregation of locally updated parameters at the central server. Similarity-aware aggregation has emerged as an effective strategy to mitigate such attacks by identifying and filtering out malicious clients based on similarity between client model parameters and those derived from clean data, i.e., data that is uncorrupted and trustworthy. However, existing methods adopt this strategy only in FL systems with clean data, making them inapplicable to settings where such data is unavailable. In this paper, we propose H+, a novel similarity-aware aggregation approach that not only outperforms existing methods in scenarios with clean data, but also extends applicability to FL systems without any clean data. Specifically, H+ randomly selects $r$-dimensional segments from the $p$-dimensional parameter vectors uploaded to the server and applies a similarity check function $H$ to compare each segment against a reference vector, preserving the most similar client vectors for aggregation. The reference vector is derived either from existing robust algorithms when clean data is unavailable or directly from clean data. Repeating this process $K$ times enables effective identification of honest clients. Moreover, H+ maintains low computational complexity, with an analytical time complexity of $\mathcal{O}(KMr)$, where $M$ is the number of clients and $Kr \ll p$. Comprehensive experiments validate H+ as a state-of-the-art (SOTA) method, demonstrating substantial robustness improvements over existing approaches under varying Byzantine attack ratios and multiple types of traditional Byzantine attacks, across all evaluated scenarios and benchmark datasets.
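The segment-comparison loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define $H$, so the sketch assumes $H$ is the Euclidean distance between a client's segment and the matching segment of the reference vector. It also assumes that a segment is a contiguous window, that each of the $K$ rounds casts one vote for the closest `n_keep` clients, and that the final aggregate is a plain mean over the most-voted clients. The function name `h_plus_sketch`, the parameter `n_keep`, the coordinate-wise median used as the reference vector, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def h_plus_sketch(updates, reference, r, K, n_keep, seed=0):
    """Illustrative segment-based filtering (assumed details, see lead-in).

    updates:   (M, p) array, one uploaded parameter vector per client
    reference: (p,) reference vector
    """
    rng = np.random.default_rng(seed)
    M, p = updates.shape
    votes = np.zeros(M, dtype=int)
    for _ in range(K):
        # Assumption: a "segment" is a random contiguous window of length r.
        start = rng.integers(0, p - r + 1)
        seg = updates[:, start:start + r]
        ref_seg = reference[start:start + r]
        # Assumption: H is the Euclidean distance to the reference segment.
        dist = np.linalg.norm(seg - ref_seg, axis=1)  # O(M r) per round
        votes[np.argsort(dist)[:n_keep]] += 1         # vote for closest clients
    # Keep the n_keep clients with the most votes (ties broken by index).
    selected = np.sort(np.argsort(-votes, kind="stable")[:n_keep])
    return updates[selected].mean(axis=0), selected

# Hypothetical data: 8 honest clients near 1.0, 2 Byzantine clients at -5.0.
data_rng = np.random.default_rng(1)
honest = data_rng.normal(1.0, 0.1, size=(8, 1000))
byzantine = -5.0 * np.ones((2, 1000))
updates = np.vstack([honest, byzantine])

# Stand-in for "an existing robust algorithm": coordinate-wise median.
reference = np.median(updates, axis=0)

aggregate, selected = h_plus_sketch(updates, reference, r=20, K=10, n_keep=8)
print(selected)
```

In this sketch only the detection loop costs on the order of $KMr$ operations. Building the median reference and averaging the selected vectors each touch all $p$ coordinates and are counted separately.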
Problem

Research questions and friction points this paper is trying to address.

Mitigating Byzantine attacks in federated learning systems
Extending similarity-aware aggregation to no-clean-data scenarios
Maintaining low computational complexity while ensuring robustness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses random parameter segments for similarity comparison
Applies iterative filtering to identify honest clients
Maintains low computational complexity, $\mathcal{O}(KMr)$ with $Kr \ll p$
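To give a sense of scale for the complexity claim, the worked comparison below uses hypothetical sizes chosen only for illustration: $M = 100$ clients, $p = 10^{6}$ parameters, $K = 10$ repetitions, and segments of length $r = 100$. None of these values are taken from the paper. The baseline is a single full-length comparison of every client vector against the reference, which costs on the order of $Mp$ operations.

$$
KMr = 10 \times 100 \times 100 = 10^{5},
\qquad
Mp = 100 \times 10^{6} = 10^{8},
\qquad
\frac{KMr}{Mp} = \frac{Kr}{p} = \frac{10^{3}}{10^{6}} = 10^{-3}.
$$

Under these assumed sizes the segment comparisons need about one thousandth of the work of a full-vector comparison, which is the effect of the condition $Kr \ll p$.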
Shiyuan Zuo
Beijing Institute of Technology
Rongfei Fan
Beijing Institute of Technology
Federated Learning, Edge Computing, Resource Allocation, Statistical Signal Processing
Cheng Zhan
Southwest University
Jie Xu
The Chinese University of Hong Kong (Shenzhen)
Puning Zhao
Sun Yat-Sen University
Han Hu
Beijing Institute of Technology