🤖 AI Summary
This work investigates why frozen representations from self-supervised learning enable efficient few-shot transfer across tasks. By introducing directional CDNV (class-wise decision-axis normalized variance), a novel geometric metric, the work unifies the modeling of variance along decision-axis directions with few-shot generalization and low interference in multitask settings, thereby overcoming limitations of classical neural collapse theory. Through non-asymptotic multiclass generalization bound analysis, the study establishes a tight theoretical link between directional CDNV and empirical generalization error. Both theoretical and empirical results demonstrate that self-supervised pretraining induces a significant collapse in directional CDNV, while decision axes for distinct tasks become approximately orthogonal, collectively enabling high-capacity yet low-interference multitask learning.
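To make the central quantity concrete, the following sketch contrasts a directional CDNV (within-class variance measured only along the centroid-difference direction) with a classical, isotropic CDNV. The exact definitions here are illustrative, assumed from the summary rather than copied from the paper: the synthetic example puts large variance orthogonal to the decision axis, so directional CDNV collapses while classical CDNV stays large.

```python
import numpy as np

def classical_cdnv(feats_a, feats_b):
    """Classical-style CDNV: total within-class variance normalized by
    the squared centroid distance. Illustrative definition only."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    dist2 = np.sum((mu_a - mu_b) ** 2)
    var_a = np.mean(np.sum((feats_a - mu_a) ** 2, axis=1))
    var_b = np.mean(np.sum((feats_b - mu_b) ** 2, axis=1))
    return (var_a + var_b) / (2.0 * dist2)

def directional_cdnv(feats_a, feats_b):
    """Directional CDNV sketch: within-class variance measured only along
    the decision axis (centroid difference), same normalization."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    axis = mu_b - mu_a
    dist2 = axis @ axis
    u = axis / np.sqrt(dist2)          # unit decision axis
    var_a = np.var(feats_a @ u)        # projected within-class variance
    var_b = np.var(feats_b @ u)
    return (var_a + var_b) / (2.0 * dist2)

# Two classes separated along coordinate 0, with large variance only in
# the directions orthogonal to the decision axis (hypothetical data).
rng = np.random.default_rng(0)
d, n = 64, 2000
scales = np.full(d, 3.0)
scales[0] = 0.1                        # tiny variance along the axis
feats_a = rng.normal(size=(n, d)) * scales
feats_b = rng.normal(size=(n, d)) * scales
feats_b[:, 0] += 4.0                   # centroid offset defines the axis

print(directional_cdnv(feats_a, feats_b))  # small: decision-axis collapse
print(classical_cdnv(feats_a, feats_b))    # large: classical CDNV stays big
```

This is exactly the regime the paper highlights: classical CDNV can remain large while the variance that matters for classification, along the decision axis, has collapsed.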
📝 Abstract
Frozen self-supervised representations often transfer well with only a few labels across many semantic tasks. We argue that a single geometric quantity, \emph{directional} CDNV (decision-axis variance), sits at the core of two favorable behaviors: strong few-shot transfer within a task, and low interference across many tasks. We show that both emerge when variability \emph{along} class-separating directions is small. First, we prove sharp non-asymptotic multiclass generalization bounds for downstream classification whose leading term is the directional CDNV. The bounds include finite-shot corrections that cleanly separate intrinsic decision-axis variability from centroid-estimation error. Second, we link decision-axis collapse to multitask geometry: for independent balanced labelings, small directional CDNV across tasks forces the corresponding decision axes to be nearly orthogonal, helping a single representation support many tasks with minimal interference. Empirically, across SSL objectives, directional CDNV collapses during pretraining even when classical CDNV remains large, and our bounds closely track few-shot error at practical shot sizes. Additionally, on synthetic multitask data, we verify that SSL learns representations whose induced decision axes are nearly orthogonal. Code and a project page are available at the \href{https://dlfundamentals.github.io/directional-neural-collapse/}{project page}.
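The multitask claim can be checked in a toy setting. The sketch below is not the paper's construction: it assumes a generic high-dimensional "frozen" feature matrix and defines each task's decision axis as the centroid-difference direction of an independent balanced binary labeling. In this regime the axes of two independent tasks come out nearly orthogonal, which is the geometry the abstract attributes to low-interference multitask learning.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 512                     # toy sizes: points and feature dimension
feats = rng.normal(size=(n, d))      # stand-in for a frozen representation

def decision_axis(feats, labels):
    """Unit centroid-difference direction for a binary labeling."""
    mu_pos = feats[labels].mean(axis=0)
    mu_neg = feats[~labels].mean(axis=0)
    axis = mu_pos - mu_neg
    return axis / np.linalg.norm(axis)

def balanced_labeling(n, rng):
    """Independent balanced binary labeling of the n points."""
    labels = np.zeros(n, dtype=bool)
    labels[rng.permutation(n)[: n // 2]] = True
    return labels

# Two independent "tasks" over the same frozen features.
a1 = decision_axis(feats, balanced_labeling(n, rng))
a2 = decision_axis(feats, balanced_labeling(n, rng))
cosine = abs(a1 @ a2)
print(cosine)    # near zero: the two task axes are close to orthogonal
```

In high dimensions the cosine concentrates around zero at roughly the rate one expects from independent random directions; the paper's result is the stronger statement that small directional CDNV itself forces this orthogonality.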