Alignment Calibration: Machine Unlearning for Contrastive Learning under Auditing

📅 2024-06-05
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 1
📄 PDF
🤖 AI Summary
This work presents the first systematic study of machine unlearning for contrastive learning (CL) models, identifying two critical gaps: the failure of existing unlearning methods under the CL paradigm and the absence of appropriate evaluation protocols. To address these, the authors propose MUC, a framework for Machine Unlearning in Contrastive learning, together with a novel method, Alignment Calibration (AC): a principled, auditable optimization objective that exploits CL's alignment property. AC achieves controllable, representation-level forgetting after sensitive data removal via contrastive loss recalibration and embedding-space alignment constraints. The framework is architecture-agnostic, supporting mainstream CL models including SimCLR, MoCo, and CLIP, and introduces new audit metrics that enable black-box verification and visual interpretability. Extensive experiments show that AC consistently approaches the performance of full retraining across multiple benchmarks, substantially outperforming prior unlearning methods, and establishes the first verifiable and interpretable data unlearning capability for CL models.
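The "alignment property" referenced above is the standard notion that a contrastive encoder should map positive pairs (two augmentations of the same example) to nearby embeddings. A simple way to see how such a property can support black-box auditing is to measure mean cosine similarity between positive-pair embeddings; after unlearning, this score on the forgotten data should drop toward that of unseen data. The sketch below is illustrative only, assuming a generic alignment score rather than the paper's exact auditing metric:

```python
import numpy as np

def alignment_score(z_a: np.ndarray, z_b: np.ndarray) -> float:
    """Mean cosine similarity between positive-pair embeddings.

    z_a[i] and z_b[i] are assumed to be the encoder outputs for two
    augmentations of the same input. High score = well-aligned pairs.
    This is a generic alignment measure, not the paper's exact formula.
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    return float(np.mean(np.sum(z_a * z_b, axis=1)))

# Toy check: identical embeddings align perfectly; independent random
# embeddings in high dimension have near-zero average cosine similarity.
rng = np.random.default_rng(0)
z = rng.normal(size=(128, 64))
print(alignment_score(z, z))                           # 1.0 (up to float error)
print(alignment_score(z, rng.normal(size=(128, 64))))  # near 0
```

An auditor with only black-box embedding access could compare this score on the forget set before and after unlearning; a large drop is (informal) evidence that the pairs' representations are no longer aligned.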

๐Ÿ“ Abstract
Machine unlearning provides viable solutions to revoke the effect of certain training data on pre-trained model parameters. Existing approaches provide unlearning recipes for classification and generative models. However, a category of important machine learning models, i.e., contrastive learning (CL) methods, is overlooked. In this paper, we fill this gap by first proposing the framework of Machine Unlearning for Contrastive learning (MUC) and adapting existing methods. Furthermore, we observe that several methods are mediocre unlearners and existing auditing tools may not be sufficient for data owners to validate the unlearning effects in contrastive learning. We thus propose a novel method called Alignment Calibration (AC) by explicitly considering the properties of contrastive learning and optimizing towards novel auditing metrics to easily verify unlearning. We empirically compare AC with baseline methods on SimCLR, MoCo and CLIP. We observe that AC addresses drawbacks of existing methods: (1) achieving state-of-the-art performance and approximating exact unlearning (retraining); (2) allowing data owners to clearly visualize the effect caused by unlearning through black-box auditing.
Problem

Research questions and friction points this paper is trying to address.

Machine unlearning for contrastive learning models
Evaluating unlearning effects with black-box methods
Addressing limitations in current unlearning validation approaches
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes the Alignment Calibration (AC) method
Optimizes towards novel auditing metrics
Enables visualization of unlearning effects via black-box auditing