🤖 AI Summary
To address inference attacks (such as membership inference and gradient inversion) that arise from gradient leakage in federated fine-tuning of large language models (LLMs), and to avoid the accuracy degradation caused by differential privacy, this paper proposes a framework that balances strong privacy guarantees with high efficiency. The method integrates LoRA-based low-rank adaptation with fully homomorphic encryption (FHE), augmented by parameter pruning and an optimized cross-silo federated algorithm. Crucially, raw gradients never leave the client, providing end-to-end privacy protection. Experiments show that model accuracy is preserved nearly unchanged while resilience against inference attacks is significantly improved, and communication and computational overhead are reduced by over 40% compared to baseline approaches. The framework achieves state-of-the-art (SOTA) privacy guarantees under rigorous cryptographic assumptions and is designed to run efficiently on the resource-constrained hardware typical of small-scale institutions.
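To see why FHE lets the server aggregate updates it cannot read, here is a didactic sketch using a toy Paillier cryptosystem (additively homomorphic only, with insecure tiny keys, standing in for the full FHE scheme the paper uses). The server multiplies ciphertexts, which corresponds to adding the underlying client updates; fixed-point scaling handles floats. All names and values are illustrative, not from the paper.

```python
# Toy additively homomorphic aggregation (Paillier with tiny, INSECURE keys).
# Illustrates the core idea only: the server sums encrypted client updates
# without ever decrypting an individual one.
import math
import random

p, q = 47, 59                        # toy primes; real systems use ~2048-bit keys
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)         # Carmichael function of n
mu = pow(lam, -1, n)                 # modular inverse of lambda mod n

def encrypt(m):
    """Paillier encryption with the g = n + 1 shortcut."""
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return ((1 + m * n) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n * mu) % n

SCALE = 100                          # fixed-point encoding: 2 decimal places
client_updates = [0.25, 0.10, 0.05]  # one scalar LoRA parameter per client
ciphers = [encrypt(round(u * SCALE)) for u in client_updates]

agg = 1
for c in ciphers:                    # homomorphic addition = ciphertext product
    agg = (agg * c) % n2

total = decrypt(agg) / SCALE
print(total)                         # 0.4 = 0.25 + 0.10 + 0.05
```

In the actual framework a lattice-based FHE scheme (rather than Paillier) would play this role, since it supports the richer encrypted computations the paper relies on.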
📝 Abstract
Federated Learning (FL) offers a decentralized framework for training and fine-tuning Large Language Models (LLMs) by leveraging computational resources across organizations while keeping sensitive data on local devices. It addresses privacy and security concerns while navigating the substantial computational demands of LLMs, which can be prohibitive for small and medium-sized organizations. FL supports the development of task-specific LLMs for cross-silo applications through fine-tuning but remains vulnerable to inference attacks, such as membership inference and gradient inversion, which threaten data privacy. Prior studies have applied Differential Privacy (DP) to LLM fine-tuning, which, despite being effective at preserving privacy, can degrade model performance. To overcome these challenges, we propose a novel method, FedShield-LLM, that combines pruning with Fully Homomorphic Encryption (FHE) for Low-Rank Adaptation (LoRA) parameters, enabling secure computation on encrypted model updates while shrinking the attack surface by deactivating less important LoRA parameters. Furthermore, optimized federated algorithms for cross-silo environments enhance scalability and efficiency. Parameter-efficient fine-tuning techniques like LoRA substantially reduce computational and communication overhead, making FL feasible for resource-constrained clients. Experimental results show that the proposed method outperforms existing methods while maintaining robust privacy protection, enabling organizations to collaboratively train secure and efficient LLMs. The code and data are available at https://github.com/solidlabnetwork/fedshield-llm
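The pruning step described above ("deactivating less important LoRA parameters") can be sketched as simple magnitude-based pruning applied to a client's LoRA update before encryption and upload. The function name, the sparsity parameter, and the sample values below are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of magnitude-based pruning of a (flattened) LoRA update:
# the least important parameters are zeroed out, shrinking both the
# attack surface and the payload that must be encrypted and transmitted.

def prune_lora_update(update, sparsity=0.5):
    """Zero out the `sparsity` fraction of entries with the smallest magnitude."""
    magnitudes = sorted(abs(v) for v in update)
    k = int(len(update) * sparsity)
    threshold = magnitudes[k - 1] if k > 0 else float("-inf")
    # Keep entries strictly above the threshold; ties at the threshold are pruned.
    return [v if abs(v) > threshold else 0.0 for v in update]

update = [0.9, -0.05, 0.4, 0.01, -0.7, 0.002]   # illustrative LoRA deltas
pruned = prune_lora_update(update, sparsity=0.5)
print(pruned)  # [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

In practice the importance criterion and sparsity level would be tuned per task; magnitude pruning is just the simplest instance of "deactivating less important parameters."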