Privacy and Accuracy-Aware AI/ML Model Deduplication

📅 2025-03-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Managing numerous differentially private (DP) model versions (e.g., trained via DP-SGD) introduces severe challenges: storage redundancy, high inference latency, and uncontrolled privacy budget consumption. Method: the paper formally defines the DP model deduplication problem and proposes a co-optimization framework that jointly respects privacy budget constraints and prediction accuracy. Specifically, it (1) introduces a greedy base-model selection strategy aware of the privacy–accuracy trade-off; (2) integrates the Sparse Vector Technique (SVT) to dynamically validate accuracy while limiting the privacy cost of private validation data; and (3) employs parameter sharing and incremental encoding to improve model reuse efficiency. Results: experiments on large language models (LLMs) and Vision Transformers (ViTs) demonstrate up to 35× storage compression and up to 43× inference speedup per model, substantially reducing I/O overhead and cumulative privacy budget expenditure.
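The SVT step can be illustrated with a minimal AboveThreshold sketch (a standalone toy under the standard sensitivity-1 assumption, not the paper's actual code): a noisy threshold is fixed once, each accuracy check is compared against it with fresh noise, and the privacy budget is charged only when a check reports "above", no matter how many checks fall below.

```python
import math
import random

def laplace(scale: float) -> float:
    """Draw one sample from Lap(0, scale) via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def above_threshold(queries, threshold, epsilon):
    """SVT / AboveThreshold: return the index of the first query whose
    noisy value exceeds a noisy threshold, or None if all fall below.
    Total privacy cost is epsilon regardless of how many 'below'
    answers are released (queries assumed to have sensitivity 1)."""
    noisy_t = threshold + laplace(2.0 / epsilon)   # pay eps/2 once
    for i, q in enumerate(queries):
        if q + laplace(4.0 / epsilon) >= noisy_t:  # fresh noise per query
            return i  # the single 'above' answer consumes the rest
    return None
```

In the deduplication setting, each query would be a (private) validation-accuracy check for a candidate deduplicated model, so many failed candidates can be screened for a single, bounded privacy charge.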

📝 Abstract
With the growing adoption of privacy-preserving machine learning algorithms, such as Differentially Private Stochastic Gradient Descent (DP-SGD), training or fine-tuning models on private datasets has become increasingly prevalent. This shift has created a need for models offering varying privacy guarantees and utility levels to satisfy diverse user requirements. However, managing numerous versions of large models introduces significant operational challenges, including increased inference latency, higher resource consumption, and elevated costs. Model deduplication is a technique widely used by model serving and database systems to support high-performance, low-cost inference queries and model diagnosis queries. However, no existing model deduplication work has considered privacy, leading to unbounded aggregation of privacy costs for certain deduplicated models and to inefficiencies when applied to DP-trained models. We formalize the problem of deduplicating DP-trained models for the first time and propose a novel privacy- and accuracy-aware deduplication mechanism to address it. We develop a greedy strategy that selects and assigns base models to target models to minimize storage and privacy costs. When deduplicating a target model, we dynamically schedule accuracy validations and apply the Sparse Vector Technique to reduce the privacy costs associated with private validation data. Compared to baselines that do not provide privacy guarantees, our approach improves the compression ratio by up to $35\times$ for individual models (including large language models and vision transformers). We also observe up to $43\times$ inference speedup due to the reduction of I/O operations.
Problem

Research questions and friction points this paper is trying to address.

Addresses privacy and accuracy in deduplicating DP-trained models.
Reduces storage and privacy costs in model deduplication.
Improves compression ratio and inference speed for large models.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Privacy-aware deduplication for DP-trained models
Greedy strategy minimizes storage and privacy costs
Dynamic accuracy validation reduces privacy costs
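The parameter-sharing and greedy-assignment ideas above can be sketched in a few lines (hypothetical illustration assuming flat weight lists and a fixed tolerance; the paper's actual block-level scheme and privacy accounting are more involved): each target model is stored as a sparse delta against a greedily chosen base model, and weights within tolerance of the base are shared rather than duplicated.

```python
def dedup_against(base, target, tol=1e-3):
    """Store only the target weights that differ from the base by more
    than tol; all other weights are shared with the base (toy sketch)."""
    return {i: t for i, (b, t) in enumerate(zip(base, target))
            if abs(t - b) > tol}

def reconstruct(base, delta):
    """Rebuild an (approximate, within-tol) target from the shared base
    plus its sparse delta."""
    return [delta.get(i, b) for i, b in enumerate(base)]

def greedy_pick_base(bases, target, tol=1e-3):
    """Greedy assignment: pick the base model that minimizes the number
    of stored delta entries for this target."""
    return min(bases, key=lambda b: len(dedup_against(b, target, tol)))
```

Smaller deltas mean fewer parameters to read at inference time, which is the source of the reported I/O and latency savings; the paper additionally charges each validation of such an approximation against the privacy budget via SVT.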