🤖 AI Summary
This work addresses a key limitation of existing representation similarity measures such as CKA and SVCCA: they capture only pointwise activation overlap and ignore geometric changes along input paths, so they fail to distinguish models with markedly different responses to perturbations or adversarial attacks. To overcome this, the study introduces the concept of holonomy from differential geometry to deep learning, proposing "representation holonomy" as a gauge-invariant statistic that quantifies curvature via the cumulative rotation of features parallel-transported around infinitesimal loops in input space, thereby characterizing path dependence in representations. The estimator combines global whitening, shared-subspace alignment, and pure-rotation Procrustes analysis to robustly estimate curvature across the full feature space. Experiments show that holonomy increases with loop radius, separates models deemed similar by CKA, correlates strongly with adversarial and corruption robustness, and tracks the evolution of feature structure throughout training.
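For intuition about holonomy itself (a textbook differential-geometry example, not part of this work): parallel-transporting a tangent vector around a closed loop on a curved surface rotates it by an angle set by the enclosed curvature, even though the vector is never turned locally. A minimal numerical sketch on the unit sphere, where the expected rotation around a circle of colatitude theta equals the enclosed solid angle 2π(1 − cos θ):

```python
import numpy as np

# Parallel-transport a tangent vector around a circle of constant
# colatitude theta on the unit sphere; the holonomy angle equals the
# enclosed solid angle, 2*pi*(1 - cos(theta)).
theta = np.pi / 4           # colatitude of the loop
steps = 20_000
phis = np.linspace(0.0, 2 * np.pi, steps + 1)

# Points on the loop (also the sphere's outward normals there).
normals = np.stack([np.sin(theta) * np.cos(phis),
                    np.sin(theta) * np.sin(phis),
                    np.full_like(phis, np.cos(theta))], axis=1)

# Unit tangent vector at the start point, pointing "south" (d/dtheta).
v = np.array([np.cos(theta), 0.0, -np.sin(theta)])
v0 = v.copy()
for n in normals[1:]:
    # Discrete parallel transport: project onto the new tangent plane
    # and renormalize (converges to Levi-Civita transport).
    v = v - np.dot(v, n) * n
    v /= np.linalg.norm(v)

angle = np.arccos(np.clip(np.dot(v, v0), -1.0, 1.0))
print(angle, 2 * np.pi * (1 - np.cos(theta)))  # the two agree closely
```

A flat surface would return the vector unchanged (zero holonomy); the nonzero angle here is pure curvature, which is exactly the quantity the paper's statistic measures for representation maps.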
📝 Abstract
Deep networks learn internal representations whose geometry--how features bend, rotate, and evolve--affects both generalization and robustness. Existing similarity measures such as CKA or SVCCA capture pointwise overlap between activation sets, but miss how representations change along input paths. Two models may appear nearly identical under these metrics yet respond very differently to perturbations or adversarial stress. We introduce representation holonomy, a gauge-invariant statistic that measures this path dependence. Conceptually, holonomy quantifies the "twist" accumulated when features are parallel-transported around a small loop in input space: flat representations yield zero holonomy, while nonzero values reveal hidden curvature. Our estimator fixes gauge through global whitening, aligns neighborhoods using shared subspaces and rotation-only Procrustes, and embeds the result back to the full feature space. We prove invariance to orthogonal (and affine, post-whitening) transformations, establish a linear null for affine layers, and show that holonomy vanishes at small radii. Empirically, holonomy increases with loop radius, separates models that appear similar under CKA, and correlates with adversarial and corruption robustness. It also tracks training dynamics as features form and stabilize. Together, these results position representation holonomy as a practical and scalable diagnostic for probing the geometric structure of learned representations beyond pointwise similarity.
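To make the estimator's ingredients concrete, here is a minimal, hypothetical NumPy sketch of the core loop described in the abstract: whiten features to fix gauge, align consecutive feature patches around a closed loop with rotation-only (Kabsch) Procrustes, compose the rotations, and read off the residual angle. Function names are illustrative, and the paper's shared-subspace alignment and embedding back to the full feature space are omitted for brevity:

```python
import numpy as np

def whiten(X, eps=1e-8):
    """Global whitening (gauge fix): zero mean, ~identity covariance."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(Xc)
    evals, evecs = np.linalg.eigh(cov)
    W = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    return Xc @ W

def rotation_procrustes(A, B):
    """Best pure rotation R (det = +1) minimizing ||A @ R - B||_F
    over rotations (Kabsch algorithm)."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    R = U @ Vt
    if np.linalg.det(R) < 0:      # exclude reflections
        U[:, -1] *= -1
        R = U @ Vt
    return R

def holonomy_angle(loop_feats):
    """Compose the rotations aligning consecutive (whitened) feature
    patches around a closed loop; return the residual rotation angle.
    Zero means a flat (path-independent) representation.

    loop_feats: list of (n, d) arrays, one patch per loop point."""
    d = loop_feats[0].shape[1]
    H = np.eye(d)
    # Pair each patch with the next, wrapping around to close the loop.
    for A, B in zip(loop_feats, loop_feats[1:] + loop_feats[:1]):
        H = H @ rotation_procrustes(A, B)
    # Scalar angle summary of the accumulated rotation H (exact for
    # d = 2; a trace-based summary for higher d).
    cos_theta = (np.trace(H) - (d - 2)) / 2.0
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
```

As a sanity check matching the theory in the abstract, identical patches around the loop (a "flat" representation) compose to the identity and give a holonomy angle of zero; any residual angle signals curvature accumulated along the path.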