Tracing Back the Malicious Clients in Poisoning Attacks to Federated Learning

📅 2024-07-09
🏛️ arXiv.org
📈 Citations: 3
✨ Influential: 0

🤖 AI Summary
In federated learning, post-hoc attribution of poisoning attacks remains infeasible when training-time defenses fail. To address this, we propose FLForensics, the first forensic framework for federated poisoning attribution. FLForensics identifies malicious clients by analyzing the global model's misclassification behavior on target samples, integrating gradient provenance, client-wise contribution attribution, and statistical significance testing. We provide theoretical guarantees that rigorously distinguish benign from malicious clients. Moreover, we formally model adaptive poisoning attacks for the first time and ensure robust traceability under such threats. Evaluated across five benchmark datasets, FLForensics achieves high recall (>92%) and low false positive rate (<3.5%) against both classical and adaptive poisoning attacks. Our work bridges a critical gap in post-deployment security auditing for federated learning systems.
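
The recall and false positive rate quoted above can be read as standard detection metrics over the set of clients. The sketch below shows how such numbers are typically computed. It assumes the textbook definitions (recall = flagged malicious / all malicious, FPR = flagged benign / all benign); the function name `detection_metrics` and the example client IDs are hypothetical and are not taken from the paper.

```python
def detection_metrics(flagged, malicious, all_clients):
    """Return (recall, false_positive_rate) for a set of flagged client IDs.

    Assumes the textbook definitions; the paper may report averages
    over several runs or target inputs.
    """
    flagged = set(flagged)
    malicious = set(malicious)
    benign = set(all_clients) - malicious

    true_positives = len(flagged & malicious)   # malicious clients that were flagged
    false_positives = len(flagged & benign)     # benign clients wrongly flagged

    recall = true_positives / len(malicious) if malicious else 0.0
    fpr = false_positives / len(benign) if benign else 0.0
    return recall, fpr


# Hypothetical example: 100 clients, the first 20 are malicious.
all_clients = range(100)
malicious = range(20)
flagged = set(range(19)) | {50, 51}  # 19 malicious caught, 2 benign misflagged

recall, fpr = detection_metrics(flagged, malicious, all_clients)
```

With 19 of 20 malicious clients flagged and 2 of 80 benign clients misflagged, this hypothetical run gives a recall of 0.95 and a false positive rate of 0.025.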

📝 Abstract
Poisoning attacks compromise the training phase of federated learning (FL) such that the learned global model misclassifies attacker-chosen inputs called target inputs. Existing defenses mainly focus on protecting the training phase of FL such that the learned global model is poison-free. However, these defenses often achieve limited effectiveness when the clients' local training data is highly non-IID or the number of malicious clients is large, as confirmed in our experiments. In this work, we propose FLForensics, the first poison-forensics method for FL. FLForensics complements existing training-phase defenses. In particular, when training-phase defenses fail and a poisoned global model is deployed, FLForensics aims to trace back the malicious clients that performed the poisoning attack after a misclassified target input is identified. We theoretically show that FLForensics can accurately distinguish between benign and malicious clients under a formal definition of poisoning attack. Moreover, we empirically show the effectiveness of FLForensics at tracing back both existing and adaptive poisoning attacks on five benchmark datasets.
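
To make the idea of tracing back from a misclassified target input more concrete, here is a toy sketch. It is not the FLForensics algorithm, whose details this page does not give. It assumes the server stored each client's update at a few training rounds, scores each client by how strongly its updates pushed the model toward the attacker's label on the target input, and flags clients whose scores are outliers. The names `influence_scores` and `flag_outliers`, the median-plus-MAD threshold, and all numbers are hypothetical.

```python
from statistics import median


def influence_scores(target_grads, client_updates):
    """Score each client by alignment of its updates with the target input.

    target_grads: one gradient vector per stored round (loss of the target
        input under the attacker's label, w.r.t. the model parameters).
    client_updates: dict of client_id -> one update vector per stored round.
    An update that lowers that loss (negative dot product with the gradient)
    contributes a positive amount to the score.
    """
    scores = {}
    for cid, updates in client_updates.items():
        scores[cid] = sum(
            -sum(g * u for g, u in zip(grad, upd))
            for grad, upd in zip(target_grads, updates)
        )
    return scores


def flag_outliers(scores, k=3.0):
    """Flag clients whose score exceeds median + k * MAD (an assumed rule)."""
    values = list(scores.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return {cid for cid, s in scores.items() if s > med + k * mad}


# Hypothetical two-parameter model with two stored rounds.
target_grads = [[1.0, 0.0], [1.0, 0.0]]
client_updates = {
    "a": [[0.1, 0.0], [-0.1, 0.0]],
    "b": [[0.0, 0.2], [0.0, -0.1]],
    "c": [[-2.0, 0.0], [-3.0, 0.0]],  # consistently pushes toward the target label
    "d": [[0.1, 0.0], [0.0, 0.0]],
}

scores = influence_scores(target_grads, client_updates)
suspects = flag_outliers(scores)
```

In this toy data only client `c` is flagged. A real forensic method would also need to handle benign clients whose data happens to resemble the target input, which is part of what the paper's theoretical analysis addresses.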
Problem

Research questions and friction points this paper is trying to address.

Tracing malicious clients in federated learning poisoning attacks
Identifying attackers after model deployment when defenses fail
Distinguishing benign from malicious clients in poisoning scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Traces malicious clients in federated learning attacks
Uses forensic analysis after model deployment
Identifies attackers when training defenses fail