Unlearning the Unpromptable: Prompt-free Instance Unlearning in Diffusion Models

📅 2026-03-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of precisely unlearning undesirable outputs in diffusion models—such as specific faces or culturally inappropriate representations—that cannot be reliably specified via textual prompts. To this end, the authors propose a prompt-free instance unlearning method that integrates image editing, timestep-aware weighting, and gradient surgery to selectively erase non-promptable target content from models like Stable Diffusion 3 and DDPM-CelebA. Experimental results demonstrate that the approach effectively removes designated instances in both conditional and unconditional diffusion models, significantly outperforming existing prompt-based and prompt-free baselines while preserving overall generative fidelity. The approach thus provides a practical hotfix for privacy preservation and ethical compliance without requiring full model retraining.

📝 Abstract
Machine unlearning aims to remove specific outputs from trained models, often at the concept level, such as forgetting all occurrences of a particular celebrity or filtering content via text prompts. However, many undesired outputs, such as an individual's face or generations that are culturally or factually misrepresented, often cannot be specified by text prompts. We address this underexplored setting of instance unlearning for outputs that are undesired but unpromptable, where the goal is to forget target outputs selectively while preserving the rest. To this end, we introduce an effective surrogate-based unlearning method that leverages image editing, timestep-aware weighting, and gradient surgery to guide trained diffusion models toward forgetting specific outputs. Experiments on conditional (Stable Diffusion 3) and unconditional (DDPM-CelebA) diffusion models demonstrate that our prompt-free method uniquely unlearns unpromptable outputs, such as faces and culturally inaccurate depictions, while preserving model integrity, unlike prompt-based and prompt-free baselines. Our proposed method can serve as a practical hotfix for diffusion model providers to ensure privacy protection and ethical compliance.
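The abstract names two generic ingredients, gradient surgery and timestep-aware weighting, without giving their formulations. As a rough illustration only, the sketch below pairs a PCGrad-style conflict projection with a hypothetical timestep weight; the function names, the sinusoidal weight, and the conflict rule are all assumptions for exposition, not the paper's actual method:

```python
import numpy as np

def gradient_surgery(g_forget, g_retain):
    """PCGrad-style projection: if the forget-loss gradient conflicts with the
    retain-loss gradient (negative dot product), drop its component along the
    retain direction so unlearning does not harm preserved outputs.
    (Hypothetical simplification; the paper's exact rule is not given here.)"""
    dot = float(np.dot(g_forget, g_retain))
    if dot < 0.0:
        g_forget = g_forget - (dot / float(np.dot(g_retain, g_retain))) * g_retain
    return g_forget

def timestep_weight(t, T=1000):
    """Hypothetical timestep-aware weight: emphasize mid-range noise levels,
    peaking at t = T/2 and vanishing at the endpoints."""
    return float(np.sin(np.pi * t / T))
```

A plausible combined update would then be `g = g_retain + timestep_weight(t) * gradient_surgery(g_forget, g_retain)`, applied per training step.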
Problem

Research questions and friction points this paper is trying to address.

machine unlearning
diffusion models
instance unlearning
unpromptable outputs
privacy protection
Innovation

Methods, ideas, or system contributions that make the work stand out.

prompt-free unlearning
instance unlearning
diffusion models
gradient surgery
timestep-aware weighting
Kyungryeol Lee
Department of Electrical and Computer Engineering (ECE), Seoul National University (SNU), Seoul, 08826, Republic of Korea
Kyeonghyun Lee
Department of Electrical and Computer Engineering (ECE), Seoul National University (SNU), Seoul, 08826, Republic of Korea
Seongmin Hong
HyperAccel
Computer Architecture, AI/ML Accelerator
Byung Hyun Lee
Department of Electrical and Computer Engineering (ECE), Seoul National University (SNU), Seoul, 08826, Republic of Korea
Se Young Chun
Department of Electrical and Computer Engineering, Seoul National University
computational imaging, machine learning, signal processing, multimodal processing