🤖 AI Summary
This work addresses the limited effectiveness of machine unlearning at the visual representation level by proposing the first representation-space-oriented unlearning framework that requires no retraining. Unlike existing methods that modify only the classifier, our approach extends unlearning to the feature representation layer, enabling complete elimination of specific concepts via geometric projection. Leveraging Neural Collapse theory, we prove that the simplex Equiangular Tight Frame (ETF) remains structurally invariant under orthogonal projection, enabling derivation of a provably optimal unlearning operator. We further introduce the Representation Unlearning Score (RUS) to quantify unlearning extent and develop two projection methods: the closed-form POUR-P (projection-only) and its distillation-enhanced variant POUR-D. Extensive experiments on CIFAR-10/100 and PathMNIST demonstrate that our framework significantly outperforms state-of-the-art methods in both classification accuracy and representation fidelity.
📝 Abstract
In computer vision, machine unlearning aims to remove the influence of specific visual concepts or training images without retraining from scratch. Studies show that existing approaches often modify the classifier while leaving internal representations intact, resulting in incomplete forgetting. In this work, we extend the notion of unlearning to the representation level, deriving a three-term interplay among forgetting efficacy, retention fidelity, and class separation. Building on Neural Collapse theory, we show that the orthogonal projection of a simplex Equiangular Tight Frame (ETF) remains an ETF in a lower-dimensional space, yielding a provably optimal forgetting operator. We further introduce the Representation Unlearning Score (RUS) to quantify representation-level forgetting and retention fidelity. Building on this, we propose POUR (Provably Optimal Unlearning of Representations), a geometric projection method with a closed-form variant (POUR-P) and a feature-level unlearning variant under a distillation scheme (POUR-D). Experiments on CIFAR-10/100 and PathMNIST demonstrate that POUR achieves effective unlearning while preserving retained knowledge, outperforming state-of-the-art unlearning methods on both classification-level and representation-level metrics.
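The ETF-projection claim admits a quick numerical check. The sketch below (a NumPy illustration of the general property, not the authors' code; all names are hypothetical) builds a simplex ETF for K classes, projects the remaining class prototypes onto the orthogonal complement of one "forgotten" class direction, and verifies that the renormalized projections again form a simplex ETF — pairwise cosine exactly -1/(K-2), the ETF condition for K-1 classes:

```python
import numpy as np

def simplex_etf(K: int, d: int) -> np.ndarray:
    """Columns: K unit vectors in R^d (d >= K) with pairwise cosine -1/(K-1)."""
    rng = np.random.default_rng(0)
    U, _ = np.linalg.qr(rng.standard_normal((d, K)))  # d x K orthonormal columns
    M = U @ (np.eye(K) - np.ones((K, K)) / K)         # center the basis vectors
    return np.sqrt(K / (K - 1)) * M                   # rescale to unit norm

K, d = 10, 64
M = simplex_etf(K, d)

# "Forget" class 0: project the other prototypes onto the complement of m_0.
m_f = M[:, 0]
P = np.eye(d) - np.outer(m_f, m_f)                    # orthogonal projector
R = P @ M[:, 1:]
R /= np.linalg.norm(R, axis=0)                        # renormalize

# Off-diagonal cosines of the projected prototypes all equal -1/(K-2),
# i.e. the projected set is again a simplex ETF over the K-1 retained classes.
G = R.T @ R
off = G[~np.eye(K - 1, dtype=bool)]
print(np.allclose(off, -1.0 / (K - 2)))               # True
```

This is the structural-invariance property that makes a closed-form projection operator possible: forgetting one class direction leaves the retained prototypes in the same maximally separated geometry, one dimension lower.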