🤖 AI Summary
This work examines the membership inference attack (MIA) risk that arises when large language models (LLMs) are fine-tuned and offered as a service in ZSM networks. An attacker with access to the fine-tuned model for a downstream task can infer whether individual samples were part of the fine-tuning data, with an attack success rate of up to 92% on named entity recognition, which can expose personal data. The authors define the characteristics of an attack network capable of mounting this attack and report that it is effective across downstream tasks. Based on the experimental analysis, they discuss possible defense mechanisms and research directions for making LLMs more trustworthy in ZSM networks.
📝 Abstract
Large language models (LLMs) have recently attracted considerable interest because of their adaptability and extensibility in emerging applications, including communication networks. ZSM networks are expected to support LLMs as a service, since they provide ultra-reliable low-latency communications and closed-loop massive connectivity. However, LLMs are vulnerable to data and model privacy issues that undermine their trustworthiness when deployed for user-based services. In this paper, we explore the security vulnerabilities associated with fine-tuning LLMs in ZSM networks, in particular the membership inference attack. We define the characteristics of an attack network that can perform a membership inference attack when the attacker has access to the fine-tuned model for the downstream task. We show that membership inference attacks are effective for any downstream task, which can lead to a personal data breach when LLMs are used as a service. The experimental results show that an attack success rate of up to 92% can be achieved on the named entity recognition task. Based on the experimental analysis, we discuss possible defense mechanisms and present possible research directions to make LLMs more trustworthy in the context of ZSM networks.
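To make the notion of a membership inference attack concrete, the following is a minimal sketch of a confidence-threshold attack. It is a simpler, well-known baseline and not the attack network described in the paper. It assumes the attacker can read a per-sample confidence score from the fine-tuned model and has some shadow data whose membership is known. All function names and numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def predict_membership(confidences: Sequence[float], threshold: float) -> List[bool]:
    """Guess 'member' when the model's confidence on a sample reaches the threshold."""
    return [c >= threshold for c in confidences]


def attack_success_rate(predictions: Sequence[bool], is_member: Sequence[bool]) -> float:
    """Fraction of samples whose membership was guessed correctly."""
    if len(predictions) != len(is_member):
        raise ValueError("predictions and is_member must have the same length")
    if not predictions:
        raise ValueError("need at least one sample")
    correct = sum(p == m for p, m in zip(predictions, is_member))
    return correct / len(predictions)


def calibrate_threshold(
    confidences: Sequence[float], is_member: Sequence[bool]
) -> Tuple[float, float]:
    """Pick the threshold with the highest success rate on shadow data
    (data for which the attacker already knows the membership labels)."""
    best_threshold, best_rate = 0.0, -1.0
    for candidate in sorted(set(confidences)):
        rate = attack_success_rate(predict_membership(confidences, candidate), is_member)
        if rate > best_rate:  # keep the lowest threshold among ties
            best_threshold, best_rate = candidate, rate
    return best_threshold, best_rate


# Hypothetical, hand-made confidences (assumption: NOT results from the paper).
# Fine-tuned models tend to be more confident on samples they were trained on.
shadow_conf = [0.99, 0.97, 0.91, 0.88, 0.72, 0.65, 0.93, 0.55]
shadow_member = [True, True, True, True, False, False, False, False]

threshold, rate = calibrate_threshold(shadow_conf, shadow_member)
print(f"threshold={threshold}, success rate on shadow data={rate}")
```

On this toy data the calibrated threshold is 0.88 and the success rate is 0.875, because one non-member (confidence 0.93) is misclassified. A learned attack model, such as the attack network the abstract refers to, replaces the single threshold with a classifier over richer model outputs.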