SPACE: Source-free Proxy Anchor Concept Erasure for MLLMs

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses machine unlearning in multimodal large language models (MLLMs) when the original sensitive data for the target concepts is unavailable. It introduces SPACE, which the authors describe as the first source-free unlearning framework for MLLMs. SPACE uses Text-Guided Proxy Anchor Selection (TPAS) to retrieve semantically aligned proxy anchors from the shared feature space, then optimizes these anchors through Dual-Constraint Semantic Isolation (DCSI), with updates confined to the null space of retained knowledge. This indirectly erases target concepts while preserving the overall knowledge structure. The theoretical analysis shows that the method bounds the perturbation on retained knowledge and maximizes feature spectral entropy. Experiments across six datasets show that SPACE performs comparably to state-of-the-art methods that rely on the original target data.
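The retrieval idea behind TPAS can be illustrated with a minimal sketch. It assumes that proxy anchors are chosen by cosine similarity between a text embedding of the target concept and a pool of candidate embeddings in the same feature space; the summary does not specify the actual selection rule, so the function name `select_proxy_anchors`, the parameter `k`, and the random data are all hypothetical.

```python
import numpy as np

def select_proxy_anchors(text_embedding, candidate_embeddings, k=3):
    """Return indices and scores of the k candidates closest to the text embedding.

    Assumption: closeness is cosine similarity in a shared feature space.
    This is an illustrative stand-in, not the paper's TPAS procedure.
    """
    t = text_embedding / np.linalg.norm(text_embedding)
    c = candidate_embeddings / np.linalg.norm(
        candidate_embeddings, axis=1, keepdims=True
    )
    sims = c @ t                      # cosine similarity per candidate
    top = np.argsort(-sims)[:k]       # indices sorted by descending similarity
    return top, sims[top]

# Toy data: 50 random candidates, one of which is planted near the target.
rng = np.random.default_rng(0)
target_text = rng.normal(size=16)
candidates = rng.normal(size=(50, 16))
candidates[7] = target_text + 0.01 * rng.normal(size=16)

idx, scores = select_proxy_anchors(target_text, candidates, k=3)
```

In this toy setup the planted candidate at index 7 is retrieved first, since it is nearly parallel to the target embedding.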

📝 Abstract

As Multimodal Large Language Models (MLLMs) face growing privacy risks and regulatory constraints, machine unlearning (MU) has emerged as a crucial solution for removing sensitive data while preserving model performance. However, existing MU methods typically rely on visual data of the target concepts, which is often unavailable due to strict data retention policies. This creates a demand for source-free unlearning approaches that operate without access to the target data. In this work, we propose Source-free Proxy Anchor Concept Erasure (SPACE), the first source-free unlearning framework specialized for MLLMs. SPACE consists of two stages: (1) Text-Guided Proxy Anchor Selection (TPAS), which retrieves semantically aligned proxy anchors from the shared feature space, and (2) Dual-Constraint Semantic Isolation (DCSI), which optimizes these anchors to indirectly erase target concepts. DCSI confines updates to the null space of retained knowledge, ensuring structural integrity. We theoretically prove that SPACE strictly bounds the perturbation on retained knowledge and maximizes feature spectral entropy, thereby maintaining the model's performance. Furthermore, extensive experiments across six datasets show that SPACE achieves performance comparable to that of state-of-the-art data-dependent methods, validating its effectiveness in source-free MU scenarios. The source code will be released.
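The null-space constraint mentioned above can be illustrated in general terms. The sketch below is not the DCSI objective, which the abstract does not spell out; it only shows the standard linear-algebra step of projecting a weight update onto the null space of a matrix of retained features, so that a linear layer's outputs on those features stay unchanged. The layer form `y = x @ W`, the function name `null_space_projector`, and the random data are assumptions made for illustration.

```python
import numpy as np

def null_space_projector(retained_features, rel_tol=1e-10):
    """Build P such that retained_features @ P is (numerically) zero.

    retained_features: array of shape (n, d), one retained feature per row.
    Returns a (d, d) orthogonal projector onto the null space of that matrix.
    """
    _, s, vt = np.linalg.svd(retained_features, full_matrices=True)
    rank = int(np.sum(s > rel_tol * s.max()))
    null_basis = vt[rank:].T          # columns span the null space, shape (d, d - rank)
    return null_basis @ null_basis.T

rng = np.random.default_rng(0)
d = 8
K = rng.normal(size=(3, d))           # hypothetical retained features
delta_w = rng.normal(size=(d, d))     # hypothetical raw update for y = x @ W

P = null_space_projector(K)
projected = P @ delta_w               # update restricted to the null space of K

# Retained inputs see no change in output: K @ (W + projected) == K @ W.
residual = np.abs(K @ projected).max()
```

Because `K @ P` vanishes, applying `projected` instead of `delta_w` leaves the layer's response to every retained feature untouched, while directions orthogonal to the retained features remain free to change.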

Problem

Research questions and friction points this paper is trying to address.

machine unlearning

source-free

multimodal large language models

concept erasure

data privacy

Innovation

Methods, ideas, or system contributions that make the work stand out.

source-free unlearning

multimodal large language models

proxy anchor