Realistic Image-to-Image Machine Unlearning via Decoupling and Knowledge Retention

📅 2025-02-06
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses machine unlearning for image-to-image (I2I) generative models, targeting verifiable and theoretically grounded privacy preservation: forgotten samples must be reliably identified as out-of-distribution (OOD) data—not merely degraded into meaningless noise. Methodologically, we propose the first gradient-ascent-based parameter decoupling mechanism to selectively strip target-sample patterns from model weights; integrate an $(varepsilon,delta)$-differentially private unlearning analysis for formal guarantees; employ adversarial attack models to empirically validate unlearning completeness; and incorporate knowledge-preserving fine-tuning to retain generation fidelity. On ImageNet-1K and Places365, our method achieves forgetting performance nearly matching full retraining. On CIFAR-10, it demonstrates significantly superior generalization robustness compared to baselines including AutoEncoder-based approaches. The framework thus bridges theoretical rigor, empirical verifiability, and practical utility in I2I unlearning.

Technology Category

Application Category

📝 Abstract
Machine Unlearning allows participants to remove their data from a trained machine learning model in order to preserve their privacy, and security. However, the machine unlearning literature for generative models is rather limited. The literature for image-to-image generative model (I2I model) considers minimizing the distance between Gaussian noise and the output of I2I model for forget samples as machine unlearning. However, we argue that the machine learning model performs fairly well on unseen data i.e., a retrained model will be able to catch generic patterns in the data and hence will not generate an output which is equivalent to Gaussian noise. In this paper, we consider that the model after unlearning should treat forget samples as out-of-distribution (OOD) data, i.e., the unlearned model should no longer recognize or encode the specific patterns found in the forget samples. To achieve this, we propose a framework which decouples the model parameters with gradient ascent, ensuring that forget samples are OOD for unlearned model with theoretical guarantee. We also provide $(epsilon, delta)$-unlearning guarantee for model updates with gradient ascent. The unlearned model is further fine-tuned on the remaining samples to maintain its performance. We also propose an attack model to ensure that the unlearned model has effectively removed the influence of forget samples. Extensive empirical evaluation on two large-scale datasets, ImageNet-1K and Places365 highlights the superiority of our approach. To show comparable performance with retrained model, we also show the comparison of a simple AutoEncoder on various baselines on CIFAR-10 dataset.
Problem

Research questions and friction points this paper is trying to address.

Enhancing privacy via machine unlearning
Treating forget samples as OOD data
Ensuring model performance post-unlearning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decouples model parameters via gradient ascent
Ensures forget samples are out-of-distribution
Fine-tunes model to maintain performance post-unlearning
🔎 Similar Papers
No similar papers found.