🤖 AI Summary
This work investigates the geometric mechanisms underlying soft-label regularization—specifically label smoothing, Mixup, and CutMix—in improving model calibration and adversarial robustness for image classification. From a representation-space geometry perspective, we conduct deep feature analysis, class-center modeling, and cosine similarity quantification. We find that all three regularizers preserve the conical topology of decision regions while consistently compressing feature magnitudes and increasing cosine similarity between features and class centers. Crucially, we identify for the first time the geometric invariance of this conical structure under soft-label regularization. Through causal analysis, we establish that magnitude compression predominantly reduces calibration error, whereas enhanced cosine similarity drives improved adversarial robustness. These findings yield an interpretable geometric framework for understanding how soft-label regularization operates, bridging representation geometry with model reliability properties.
📝 Abstract
Recent studies have shown that regularization techniques using soft labels, e.g., label smoothing, Mixup, and CutMix, not only enhance image classification accuracy but also improve model calibration and robustness against adversarial attacks. However, the underlying mechanisms of these improvements remain underexplored. In this paper, we offer a novel explanation from the perspective of the representation space (i.e., the space of the features obtained at the penultimate layer). Our investigation first reveals that the decision regions in the representation space form cone-like shapes around the origin after training, regardless of whether regularization is applied. Applying regularization, however, changes the distribution of the features (or representation vectors): their magnitudes are reduced and, consequently, the cosine similarities between the representation vectors and the class centers (the minimal-loss points for each class) increase, which acts as the central mechanism behind the improved calibration and robustness. Our findings provide new insights into the characteristics of the high-dimensional representation space in relation to training and regularization with soft labels.
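To make the quantities in the abstract concrete, here is a minimal NumPy sketch (not the authors' code) of label smoothing and of the two geometric measurements discussed: feature-vector magnitudes and cosine similarities to class centers. The class center is approximated here by the per-class mean feature, a simplification of the paper's "minimal loss point for each class".

```python
import numpy as np

def smooth_labels(y, num_classes, eps=0.1):
    """Label smoothing: mix the one-hot target with the uniform distribution."""
    one_hot = np.eye(num_classes)[y]
    return one_hot * (1.0 - eps) + eps / num_classes

def cosine_to_class_centers(features, labels, num_classes):
    """Cosine similarity of each penultimate-layer feature to its class center.
    The center is approximated by the class-mean feature (an assumption; the
    paper defines it as the minimal-loss point of each class)."""
    sims = np.empty(len(features))
    for c in range(num_classes):
        mask = labels == c
        center = features[mask].mean(axis=0)
        sims[mask] = (features[mask] @ center) / (
            np.linalg.norm(features[mask], axis=1) * np.linalg.norm(center) + 1e-12
        )
    return sims

# Toy usage: two tight clusters in a 2-D "representation space"
rng = np.random.default_rng(0)
labels = np.array([0] * 50 + [1] * 50)
features = np.vstack([
    rng.normal([5.0, 0.0], 0.3, size=(50, 2)),
    rng.normal([0.0, 5.0], 0.3, size=(50, 2)),
])
targets = smooth_labels(labels, num_classes=2, eps=0.1)
magnitudes = np.linalg.norm(features, axis=1)   # feature-norm distribution
sims = cosine_to_class_centers(features, labels, num_classes=2)
print(targets[0])  # smoothed target for a class-0 sample: [0.95 0.05]
```

Under the paper's account, training with such smoothed targets shrinks the `magnitudes` distribution (driving better calibration) while pushing `sims` toward 1 (driving adversarial robustness).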