Tracing Back the Malicious Clients in Poisoning Attacks to Federated Learning

📅 2024-07-09

🏛️ arXiv.org

📈 Citations: 3

✨ Influential: 0

🤖 AI Summary

In federated learning, post-hoc attribution of poisoning attacks remains infeasible when training-time defenses fail. To address this, we propose FLForensics, the first forensic framework for federated poisoning attribution. FLForensics identifies malicious clients by analyzing the global model’s misclassification behavior on target samples, integrating gradient provenance, client-wise contribution attribution, and statistical significance testing. We show theoretically that, under a formal definition of poisoning attack, FLForensics distinguishes benign from malicious clients. Moreover, we formally model adaptive poisoning attacks for the first time and show that traceability remains robust under such threats. Evaluated across five benchmark datasets, FLForensics achieves high recall (>92%) and a low false positive rate (<3.5%) against both classical and adaptive poisoning attacks. Our work bridges a critical gap in post-deployment security auditing for federated learning systems.
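The recall and false positive rate quoted above are detection metrics over clients. As a minimal sketch of how such numbers are typically computed (not the paper's evaluation code), the snippet below assumes a hypothetical helper `detection_metrics` and made-up client IDs: recall is the fraction of malicious clients that were flagged, and the false positive rate is the fraction of benign clients that were flagged.

```python
def detection_metrics(flagged, malicious, all_clients):
    """Return (recall, false_positive_rate) for a set of flagged clients.

    All three arguments are iterables of client IDs. The names and the
    interface are illustrative assumptions, not taken from the paper.
    """
    flagged, malicious, all_clients = set(flagged), set(malicious), set(all_clients)
    benign = all_clients - malicious

    true_pos = len(flagged & malicious)   # malicious clients correctly traced back
    false_pos = len(flagged & benign)     # benign clients wrongly accused

    # Guard against empty groups to avoid division by zero.
    recall = true_pos / len(malicious) if malicious else 0.0
    fpr = false_pos / len(benign) if benign else 0.0
    return recall, fpr


if __name__ == "__main__":
    # Toy example: 10 clients, 4 of them malicious, 4 flagged by the tracer.
    clients = range(10)
    malicious = {0, 1, 2, 3}
    flagged = {0, 1, 2, 7}
    recall, fpr = detection_metrics(flagged, malicious, clients)
    print(f"recall={recall:.2f}, fpr={fpr:.2f}")
```

In this toy run, three of the four malicious clients are traced and one of the six benign clients is wrongly flagged. A forensic method wants both high recall and a low false positive rate, since wrongly accusing a benign client carries its own cost.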

📝 Abstract

Poisoning attacks compromise the training phase of federated learning (FL) such that the learned global model misclassifies attacker-chosen inputs called target inputs. Existing defenses mainly focus on protecting the training phase of FL such that the learned global model is poison-free. However, these defenses often achieve limited effectiveness when the clients' local training data is highly non-IID or the number of malicious clients is large, as confirmed in our experiments. In this work, we propose FLForensics, the first poison-forensics method for FL. FLForensics complements existing training-phase defenses. In particular, when training-phase defenses fail and a poisoned global model is deployed, FLForensics aims to trace back the malicious clients that performed the poisoning attack after a misclassified target input is identified. We theoretically show that FLForensics can accurately distinguish between benign and malicious clients under a formal definition of poisoning attack. Moreover, we empirically show the effectiveness of FLForensics at tracing back both existing and adaptive poisoning attacks on five benchmark datasets.

Problem

Research questions and friction points this paper is trying to address.

Tracing malicious clients in federated learning poisoning attacks

Identifying attackers after model deployment when defenses fail

Distinguishing benign from malicious clients in poisoning scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Traces malicious clients in federated learning attacks

Uses forensic analysis after model deployment

Identifies attackers when training defenses fail
