🤖 AI Summary
This work addresses the challenge of auditing whether domain-adapted large language models, particularly those fine-tuned with parameter-efficient methods such as LoRA, have been trained on specific data samples, a capability essential for safeguarding intellectual property and sensitive information. The authors propose LoRA-MINT, the first scalable and efficient training data auditing framework tailored for LoRA-finetuned models. By systematically analyzing the relationship between model perplexity and membership status, LoRA-MINT infers whether individual samples were part of the fine-tuning data. Most of the methodology also applies to other techniques for adapting LLMs. Experiments across four language models and three benchmark datasets yield precision scores ranging from 0.77 to 0.92, outperforming existing baselines and supporting the approach’s robustness and broad applicability.
📝 Abstract
We present LoRA-MINT, a new methodology for Membership Inference Test (MINT) applied to recent Large Language Models (LLMs) fine-tuned for specific Natural Language Processing (NLP) tasks through Low-Rank Adaptation (LoRA). The primary goal is to assess whether individual samples were part of the training data of these adapted models, providing a useful auditing tool for the management of intellectual property and sensitive data. Our analysis explores the relationship between model perplexity and membership status, providing a systematic framework for estimating data exposure in fine-tuned LLMs. We conducted experiments on four models and three benchmark datasets, obtaining precision values between 0.77 and 0.92 in determining whether given data were used for training. These results outperform state-of-the-art baselines and demonstrate the robustness and generality of the proposed method. Overall, our findings underscore the potential of LoRA-MINT as an effective and scalable framework for auditing LLMs, improving transparency, and fostering the ethical and responsible deployment of AI and NLP technologies. For the sake of concreteness and current relevance, our discussion and experiments are centered on LoRA-adjusted LLMs, but note that most of the presented methodology is readily applicable to auditing training data under any other technique for adapting LLMs or, more generally, to any other domain-adapted AI models.
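The abstract does not state how perplexity is turned into a membership decision, so the following is only a minimal sketch under a stated assumption: a sample is flagged as a training member when its perplexity under the fine-tuned model falls below a fixed threshold. This threshold rule, the function names, and the toy log-probabilities are all hypothetical illustrations and are not taken from LoRA-MINT itself. The sketch shows how perplexity is computed from per-token log-probabilities and how precision, the metric reported above, would be measured for such a rule.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean per-token log-probability."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def is_member(token_logprobs, threshold):
    """Hypothetical decision rule (an assumption, not the paper's method):
    low perplexity suggests the sample was seen during fine-tuning."""
    return perplexity(token_logprobs) < threshold


def precision(predictions, labels):
    """Fraction of samples flagged as members that truly are members."""
    true_pos = sum(1 for p, y in zip(predictions, labels) if p and y)
    false_pos = sum(1 for p, y in zip(predictions, labels) if p and not y)
    flagged = true_pos + false_pos
    return true_pos / flagged if flagged else 0.0


# Toy data: per-token natural-log probabilities for three made-up samples.
samples = [
    [math.log(0.5)] * 4,   # perplexity 2: model is fairly confident
    [math.log(0.4)] * 4,   # perplexity 2.5
    [math.log(0.1)] * 4,   # perplexity 10: model is surprised
]
labels = [True, False, False]  # made-up ground-truth membership

predictions = [is_member(s, threshold=3.0) for s in samples]
audit_precision = precision(predictions, labels)
```

In practice the per-token log-probabilities would come from the adapted model being audited, and the threshold would have to be calibrated on data with known membership; both steps are outside what the abstract describes.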