🤖 AI Summary
In prototype-based self-supervised learning, partial prototype collapse -- where multiple prototypes converge to nearly identical representations -- is a prevalent issue that undermines the prototypes' ability to guide the encoder toward diverse features. This paper identifies, for the first time, the joint optimization of the encoder and prototypes as the root cause of the collapse. To address this, we propose a fully decoupled training paradigm in which prototype learning is strictly separated from encoder optimization: prototypes are updated independently as a Gaussian Mixture Model (GMM) fitted with an online Expectation-Maximization (EM) algorithm, requiring no explicit regularization or over-parameterization. This mechanism eliminates collapse at its source, substantially enhancing prototype diversity and representation discriminability. Empirically, our approach yields more stable and consistently superior performance across downstream tasks, outperforming state-of-the-art methods without architectural or loss-function modifications.
📝 Abstract
Prototypical self-supervised learning methods consistently suffer from partial prototype collapse, where multiple prototypes converge to nearly identical representations. This undermines their central purpose -- providing diverse and informative targets to guide encoders toward rich representations -- and has led practitioners to over-parameterize prototype sets or add ad-hoc regularizers, which mitigate symptoms rather than address the root cause. We empirically trace the collapse to the joint optimization of encoders and prototypes, which encourages a type of shortcut learning: early in training prototypes drift toward redundant representations that minimize loss without necessarily enhancing representation diversity. To break the joint optimization, we introduce a fully decoupled training strategy that learns prototypes and encoders under separate objectives. Concretely, we model prototypes as a Gaussian mixture updated with an online EM-style procedure, independent of the encoder's loss. This simple yet principled decoupling eliminates prototype collapse without explicit regularization and yields consistently diverse prototypes and stronger downstream performance.
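The decoupling described above can be illustrated with a minimal sketch: prototypes are treated as the means of a spherical Gaussian mixture and updated by an online EM step on detached encoder embeddings, so no gradient from the encoder's loss ever touches them. Everything here (function name, spherical/fixed-variance assumption, learning-rate-style interpolation, hyperparameter values) is an illustrative assumption, not the paper's exact procedure.

```python
import numpy as np

def online_em_step(z, mu, pi, lr=0.05, sigma2=0.1):
    """One online EM update of Gaussian-mixture prototypes (illustrative sketch).

    z:  (B, D) batch of detached encoder embeddings
    mu: (K, D) prototype means
    pi: (K,)   mixture weights
    """
    # E-step: responsibilities under spherical Gaussians with variance sigma2
    d2 = ((z[:, None, :] - mu[None, :, :]) ** 2).sum(-1)   # (B, K) squared distances
    logits = np.log(pi + 1e-8) - d2 / (2.0 * sigma2)
    logits -= logits.max(axis=1, keepdims=True)            # numerical stability
    r = np.exp(logits)
    r /= r.sum(axis=1, keepdims=True)                      # (B, K) responsibilities

    # M-step: interpolate running parameters toward batch statistics
    nk = r.sum(axis=0) + 1e-8                              # (K,) soft counts
    mu_batch = (r.T @ z) / nk[:, None]                     # batch means per component
    mu = (1.0 - lr) * mu + lr * mu_batch
    pi = (1.0 - lr) * pi + lr * (nk / len(z))
    return mu, pi
```

Because the update depends only on batch statistics, the encoder can still be trained against these prototypes (e.g. via an assignment-prediction loss) while the prototypes themselves follow this separate EM objective, which is the sense in which the two optimizations are decoupled.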