Harvesting Private Medical Images in Federated Learning Systems with Crafted Models

📅 2024-07-13
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
📄 PDF
🤖 AI Summary
Emerging privacy leakage risks in medical federated learning (FL) threaten sensitive patient data, particularly from medical imaging. Method: We propose MediLeak, a server-side attack that injects a crafted prefix module into the global model to manipulate client updates, inducing gradient disclosures that encode sensitive image information; it bypasses cryptographic defenses (e.g., secure aggregation) through structural model tampering rather than by breaking encryption. Contribution/Results: By combining analytical gradient modeling with optimization-driven reconstruction, MediLeak achieves near-perfect image recovery on MedMNIST and COVIDx CXR-4 (PSNR > 30 dB, SSIM > 0.9). Classifiers trained on the reconstructed images perform statistically indistinguishably from models trained on the original data. This demonstrates a critical architecture-level privacy vulnerability in current medical FL systems: model structure integrity, not just gradient confidentiality, is essential for privacy preservation.

📝 Abstract
Federated learning (FL) allows a set of clients to collaboratively train a machine-learning model without exposing local training samples. In this context, it is considered privacy-preserving and has therefore been adopted by medical centers to train machine-learning models over private data. However, in this paper, we propose a novel attack named MediLeak that enables a malicious parameter server to recover high-fidelity patient images from the model updates uploaded by the clients. MediLeak requires the server to generate an adversarial model by adding a crafted module in front of the original model architecture. The adversarial model is published to the clients in the regular FL training process, and each client conducts local training on it to generate corresponding model updates. Then, following the FL protocol, the model updates are sent back to the server, and our proposed analytical method recovers private data from the parameter updates of the crafted module. We provide a comprehensive analysis of MediLeak and show that it successfully breaks state-of-the-art cryptographic secure aggregation protocols designed to protect FL systems from privacy inference attacks. We implement MediLeak on the MedMNIST and COVIDx CXR-4 datasets. The results show that MediLeak can nearly perfectly recover private images with high recovery rates and quantitative scores. We further perform downstream tasks such as disease classification with the recovered data, where our results show no significant performance degradation compared to using the original training samples.
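The abstract's "analytical method" for recovering inputs from a crafted front module is not spelled out here, but the underlying intuition can be illustrated with the well-known gradient-leakage identity for a fully connected layer: for y = Wx + b, the gradients satisfy dL/dW = outer(dL/dy, x) and dL/db = dL/dy, so any row with a nonzero bias gradient reveals x exactly. The sketch below is a toy illustration of that identity, not MediLeak's actual module or reconstruction pipeline; all names and shapes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Private" client input: a flattened 4x4 toy image patch (hypothetical data).
x = rng.random(16)

# A single fully connected layer y = W x + b, standing in for a crafted
# prefix module placed in front of the original model.
W = rng.normal(size=(8, 16))
b = np.zeros(8)
y = W @ x + b

# Suppose the downstream loss L yields gradient g = dL/dy at this layer's output.
g = rng.normal(size=8)

# Backprop through the layer: dL/dW = outer(g, x), dL/db = g.
grad_W = np.outer(g, x)
grad_b = g

# Analytic recovery: for any row i with grad_b[i] != 0,
# grad_W[i] / grad_b[i] equals x exactly.
i = int(np.argmax(np.abs(grad_b)))
x_recovered = grad_W[i] / grad_b[i]

print(np.allclose(x_recovered, x))  # True
```

This identity is why parameter updates from a maliciously structured layer can leak inputs even when the server never sees raw data; MediLeak's contribution lies in crafting the module and recovery so the attack survives local training over batches and cryptographic secure aggregation.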
Problem

Research questions and friction points this paper is trying to address.

Exposes privacy risks in federated learning for medical data
Recovers private medical data from client model updates
Bypasses secure aggregation protocols in federated learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adversarially crafted model in FL
Recover private data from updates
Bypasses secure aggregation protocols