🤖 AI Summary
This work systematically investigates privacy risks arising during fine-tuning of large language models (LLMs), focusing on three core threats: membership inference, data extraction, and backdoor attacks. We establish the first unified analytical framework that quantitatively characterizes the fundamental trade-off between adversarial capability and defense efficacy. Through empirical evaluation of mainstream defenses—including differential privacy, federated learning, and machine unlearning—we identify their effectiveness boundaries and inherent limitations in the fine-tuning setting, revealing five critical research gaps. Building on these insights, we propose a novel, deployment-oriented paradigm for privacy-preserving LLM fine-tuning. Our framework provides both theoretical foundations and practical technical pathways for developing efficient, verifiable, and low-overhead privacy assurance systems for LLM fine-tuning.
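Among the defenses evaluated above, differential privacy for fine-tuning is typically realized via DP-SGD: per-example gradients are clipped to a fixed norm and Gaussian noise is added before aggregation. A minimal sketch of that aggregation step is below; the function name and hyperparameter defaults (`clip_norm`, `noise_mult`) are illustrative choices, not values from the paper.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_mult=1.1, rng=None):
    """One DP-SGD aggregation step (sketch): clip each per-example gradient
    to clip_norm, sum the clipped gradients, then add Gaussian noise with
    standard deviation noise_mult * clip_norm before averaging."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down only gradients whose norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_mult * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)

# Two toy per-example gradients with norms 5.0 and 0.5: only the first is clipped.
grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]
print(dp_sgd_step(grads))
```

The clipping bound limits any single example's influence on the update, which is what lets the added noise translate into a formal privacy guarantee; the noise multiplier trades off privacy budget against model utility.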
📝 Abstract
Fine-tuning has emerged as a critical step in adapting Large Language Models (LLMs) to specific downstream tasks, enabling state-of-the-art performance across diverse domains. However, fine-tuning often involves sensitive datasets, introducing privacy risks because attacks can exploit the unique characteristics of this stage. In this paper, we provide a comprehensive survey of privacy challenges associated with fine-tuning LLMs, highlighting vulnerabilities to privacy attacks such as membership inference, data extraction, and backdoor attacks. We then review defense mechanisms designed to mitigate privacy risks during fine-tuning, including differential privacy, federated learning, and knowledge unlearning, discussing their effectiveness and limitations in addressing these risks while maintaining model utility. By identifying key gaps in existing research, we highlight open challenges and propose directions for advancing privacy-preserving fine-tuning of LLMs, promoting their responsible use across diverse applications.
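Of the attacks surveyed above, membership inference is the simplest to illustrate: a fine-tuned model tends to assign lower loss to examples it was trained on, so thresholding per-example loss distinguishes members from non-members. The toy unigram "model" and threshold below are purely illustrative stand-ins for a real LLM's token likelihoods, not artifacts from the paper.

```python
import math

# Toy stand-in for a fine-tuned model: unigram probabilities that were
# boosted on the (member) fine-tuning corpus. All tokens and values are
# illustrative assumptions, not data from the survey.
MODEL_PROBS = {"alice": 0.30, "secret": 0.25, "token": 0.20,
               "hello": 0.10, "world": 0.10}
DEFAULT_PROB = 0.01  # probability assigned to tokens the model never saw

def avg_nll(text: str) -> float:
    """Average negative log-likelihood of a text under the toy model."""
    tokens = text.lower().split()
    return -sum(math.log(MODEL_PROBS.get(t, DEFAULT_PROB))
                for t in tokens) / len(tokens)

def is_member(text: str, threshold: float = 2.5) -> bool:
    """Loss-threshold membership inference: low loss suggests the text
    resembles the fine-tuning data, so flag it as a likely member."""
    return avg_nll(text) < threshold

print(is_member("alice secret token"))            # resembles training data
print(is_member("completely unrelated sentence")) # unseen tokens, high loss
```

Real attacks refine this idea with calibrated thresholds or shadow models, but the core signal, memorization lowering loss on training members, is the same one that defenses such as differential privacy aim to suppress.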