Poison to Detect: Detection of Targeted Overfitting in Federated Learning

📅 2025-09-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies and addresses a novel privacy threat in federated learning: a malicious orchestrator manipulates model aggregation to induce targeted overfitting on specific clients, thereby exposing those clients' private data. Unlike conventional defenses that focus on mitigating information leakage during training, the authors propose the first client-side proactive verification framework. It introduces three lightweight, interpretable, and real-time verifiable detection methods (label-flipping verification, backdoor trigger injection, and model fingerprint analysis) to monitor the aggregation process. Extensive experiments across multiple datasets and attack scenarios demonstrate that all three methods reliably detect targeted overfitting with low latency, low false-positive rates, and bounded computational overhead. The approach significantly strengthens clients' autonomy in detecting adversarial aggregation that outwardly preserves model integrity, establishing a new paradigm for verifiable federated learning.

📝 Abstract
Federated Learning (FL) enables collaborative model training across decentralised clients while keeping local data private, making it a widely adopted privacy-enhancing technology (PET). Despite its privacy benefits, FL remains vulnerable to privacy attacks, including those targeting specific clients. In this paper, we study an underexplored threat where a dishonest orchestrator intentionally manipulates the aggregation process to induce targeted overfitting in the local models of specific clients. Whereas many studies in this area predominantly focus on reducing the amount of information leakage during training, we focus on enabling an early client-side detection of targeted overfitting, thereby allowing clients to disengage before significant harm occurs. In line with this, we propose three detection techniques, namely (a) label flipping, (b) backdoor trigger injection, and (c) model fingerprinting, that enable clients to verify the integrity of the global aggregation. We evaluated our methods on multiple datasets under different attack scenarios. Our results show that the three methods reliably detect targeted overfitting induced by the orchestrator, but they differ in terms of computational complexity, detection latency, and false-positive rates.
Problem

Research questions and friction points this paper is trying to address.

Detecting targeted overfitting in federated learning systems
Enabling early client-side detection of aggregation manipulation
Verifying global model integrity via three detection techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Label flipping for integrity verification
Backdoor trigger injection detection
Model fingerprinting technique
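The paper's exact protocols are not reproduced here, but the label-flipping idea above can be sketched as a client-side canary check: the client plants a few deliberately mislabelled samples, and if the global model starts reproducing those wrong labels far above what a benign model would, the client flags the round as targeted overfitting. A minimal sketch follows; the function names, the canary construction, and the 0.5 threshold are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def canary_agreement(predict_fn, canary_x, flipped_y):
    """Fraction of canary samples on which the global model outputs the
    deliberately wrong (flipped) label. High agreement suggests the
    model has been driven to memorise this client's data."""
    preds = np.asarray(predict_fn(canary_x))
    return float(np.mean(preds == np.asarray(flipped_y)))

def detect_targeted_overfitting(predict_fn, canary_x, flipped_y, threshold=0.5):
    """Flag the round as suspicious if the model reproduces the flipped
    labels more often than `threshold` (an illustrative cut-off)."""
    score = canary_agreement(predict_fn, canary_x, flipped_y)
    return score >= threshold, score

# Toy check with hypothetical stand-in models:
canary_x = np.arange(10).reshape(-1, 1)
flipped_y = np.ones(10, dtype=int)                    # deliberately wrong labels
memorised = lambda x: np.ones(len(x), dtype=int)      # has absorbed the canaries
honest = lambda x: np.zeros(len(x), dtype=int)        # predicts the true class
```

On this toy data the memorising model triggers the flag while the honest one does not; in practice the threshold would be calibrated against the chance-level agreement of a benign global model.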
Soumia Zohra El Mestari
Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg
Maciej Krzysztof Zuziak
KDD Lab - ISTI, National Research Council of Italy
Gabriele Lenzini
Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg
Sociotechnical Cybersecurity