Pruning as a Defense: Reducing Memorization in Large Language Models

📅 2025-02-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) inherently memorize training data, which poses significant privacy risks, including data leakage and membership inference attacks. This work systematically demonstrates, for the first time, that structured pruning and sparsification guided by weight importance estimation effectively suppress memorization in LLMs, establishing "pruning-as-defense" as a novel privacy-preserving paradigm. Evaluated across multiple mainstream LLMs, the approach reduces the leakage rate of memorized data by 40–65% while retaining over 90% of original task performance. Experiments under standardized membership inference attack benchmarks confirm that pruning is not merely a model compression technique: it also serves as a lightweight, general-purpose, retraining-free defense mechanism for privacy enhancement.

📝 Abstract
Large language models have been shown to memorize significant portions of their training data, which they can reproduce when appropriately prompted. This work investigates the impact of simple pruning techniques on this behavior. Our findings reveal that pruning effectively reduces the extent of memorization in LLMs, demonstrating its potential as a foundational approach for mitigating membership inference attacks.
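
As a rough illustration of the idea, the sketch below zeroes out the smallest-magnitude weights of a small matrix and computes a simple exact-match extraction rate. It is a minimal sketch under stated assumptions, not the authors' procedure: unstructured magnitude pruning stands in for the "simple pruning techniques" the abstract mentions, and the names `magnitude_prune` and `extraction_rate` are hypothetical helpers introduced here.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries set to 0.

    Assumption: unstructured magnitude pruning, used only as an example of a
    simple pruning technique. `sparsity` is the fraction of entries to remove.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    # Rank every entry by absolute value, remembering its position.
    cells = sorted(
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    )
    k = int(len(cells) * sparsity)  # number of entries to zero out
    pruned = [list(row) for row in weights]
    for _, i, j in cells[:k]:
        pruned[i][j] = 0.0
    return pruned


def extraction_rate(generated, reference):
    """Fraction of prompts whose generated continuation exactly matches the
    training continuation (a hypothetical, simplified memorization measure)."""
    if len(generated) != len(reference):
        raise ValueError("inputs must have the same length")
    if not reference:
        return 0.0
    matches = sum(g == r for g, r in zip(generated, reference))
    return matches / len(reference)


if __name__ == "__main__":
    w = [[0.5, -0.1], [0.05, -2.0]]
    print(magnitude_prune(w, 0.5))
    print(extraction_rate(["a", "b", "c", "d"], ["a", "x", "c", "y"]))
```

In an actual evaluation one would compare the extraction rate of the original model with that of the pruned model on the same prompts drawn from the training data; the toy lists here only show the shape of that comparison.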

Problem

Research questions and friction points this paper is trying to address.

Reducing memorization in LLMs
Impact of pruning techniques
Mitigating membership inference attacks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Pruning reduces memorization
Technique mitigates inference attacks
Foundational approach for LLMs

Mansi Gupta
Google DeepMind
Ethics in AI, Natural Language Processing, Machine Learning, Interpretability of Neural Models

Nikhar Waghela
Indian Institute of Technology, Roorkee, Roorkee, Uttarakhand, India

Sarthak Gupta
Indian Institute of Technology, Roorkee, Roorkee, Uttarakhand, India

Shourya Goel
Indian Institute of Technology, Roorkee, Roorkee, Uttarakhand, India

Sanjif Shanmugavelu
Groq
Deep Learning