Alignment Calibration: Machine Unlearning for Contrastive Learning under Auditing

📅 2024-06-05
🏛️ arXiv.org
📈 Citations: 1
✨ Influential: 1
📄 PDF
🤖 AI Summary
This work presents the first systematic study of machine unlearning for contrastive learning (CL) models, identifying two critical gaps: the failure of existing unlearning methods under the CL paradigm and the absence of appropriate evaluation protocols. To address these, the authors propose MUC, a framework for Machine Unlearning in Contrastive learning, together with a novel method, Alignment Calibration (AC): a principled, auditable optimization objective that exploits CL's alignment property. AC achieves controllable, representation-level forgetting after sensitive data removal via contrastive loss recalibration and embedding-space alignment constraints. The framework is architecture-agnostic, supporting mainstream CL models including SimCLR, MoCo, and CLIP, and introduces new audit metrics that enable black-box verification and visual interpretability. Extensive experiments show that AC consistently approaches the performance of full retraining across multiple benchmarks, substantially outperforming prior unlearning methods, and establishes the first verifiable and interpretable data unlearning capability for CL models.
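The "alignment property" referenced above is the standard notion that a contrastive encoder should map positive pairs (two augmentations of the same example) to nearby embeddings. A simple way to see how such a property can support black-box auditing is to measure mean cosine similarity between positive-pair embeddings; after unlearning, this score on the forgotten data should drop toward that of unseen data. The sketch below is illustrative only, assuming a generic alignment score rather than the paper's exact auditing metric:

```python
import numpy as np

def alignment_score(z_a: np.ndarray, z_b: np.ndarray) -> float:
    """Mean cosine similarity between positive-pair embeddings.

    z_a[i] and z_b[i] are assumed to be the encoder outputs for two
    augmentations of the same input. High score = well-aligned pairs.
    This is a generic alignment measure, not the paper's exact formula.
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    return float(np.mean(np.sum(z_a * z_b, axis=1)))

# Toy check: identical embeddings align perfectly; independent random
# embeddings in high dimension have near-zero average cosine similarity.
rng = np.random.default_rng(0)
z = rng.normal(size=(128, 64))
print(alignment_score(z, z))                           # 1.0 (up to float error)
print(alignment_score(z, rng.normal(size=(128, 64))))  # near 0
```

An auditor with only black-box embedding access could compare this score on the forget set before and after unlearning; a large drop is (informal) evidence that the pairs' representations are no longer aligned.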

๐Ÿ“ Abstract
Machine unlearning provides viable solutions to revoke the effect of certain training data on pre-trained model parameters. Existing approaches provide unlearning recipes for classification and generative models. However, a category of important machine learning models, i.e., contrastive learning (CL) methods, is overlooked. In this paper, we fill this gap by first proposing the framework of Machine Unlearning for Contrastive learning (MUC) and adapting existing methods. Furthermore, we observe that several methods are mediocre unlearners and existing auditing tools may not be sufficient for data owners to validate the unlearning effects in contrastive learning. We thus propose a novel method called Alignment Calibration (AC) by explicitly considering the properties of contrastive learning and optimizing towards novel auditing metrics to easily verify unlearning. We empirically compare AC with baseline methods on SimCLR, MoCo and CLIP. We observe that AC addresses drawbacks of existing methods: (1) achieving state-of-the-art performance and approximating exact unlearning (retraining); (2) allowing data owners to clearly visualize the effect caused by unlearning through black-box auditing.
Problem

Research questions and friction points this paper is trying to address.

Machine unlearning for contrastive learning models
Evaluating unlearning effects with black-box methods
Addressing limitations in current unlearning validation approaches
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes the Alignment Calibration (AC) method
Optimizes towards novel auditing metrics
Enables visualization of unlearning effects via black-box auditing