🤖 AI Summary
This work addresses a vulnerability of high-confidence (inverse) adversarial training: high confidence can arise from overfitting to non-causal background correlations, which impairs robust generalization. To mitigate this, the authors propose HICAT, a framework that adaptively evaluates the utility of visual context through a “Measure-Debias-Align” pipeline, calibrating logits and disentangling foreground features. The key contributions are the observation that background signals play a dual role in high-confidence predictions (supportive prior or spurious confounder), a semantic balance that avoids the feature loss caused by indiscriminate suppression, and three components: a Learnable Background-Bias Estimator (LBBE), an adaptive debiasing module, and a Foreground Logit Orthogonal Enhancement (FLOE) loss. Experiments on CIFAR-10, CIFAR-100, and ImageNet-1K show that HICAT consistently improves over matched baselines and narrows the robust generalization gap across both CNN and ViT architectures.
📝 Abstract
Inverse adversarial training leverages high-confidence predictions to stabilize robust learning, yet we uncover a critical paradox: high confidence often stems from overfitting to non-causal background correlations rather than intrinsic object semantics. Our investigation reveals that visual context functions as a dual-natured signal, serving as either a necessary supportive prior or a spurious confounder. This insight renders existing blind suppression strategies flawed, as they inevitably lead to severe Feature Loss. To resolve this, we propose High-Confidence Causally Aligned Training (HICAT), a unified framework that establishes a Semantic Equilibrium. Operating on a "Measure-Debias-Align" pipeline, HICAT integrates a Learnable Background-Bias Estimator (LBBE) to adaptively diagnose context utility. Guided by this diagnosis, an Adaptive Debiasing mechanism performs surgical logit rectification, complemented by a geometrically grounded Foreground Logit Orthogonal Enhancement (FLOE) loss to enforce rigorous feature disentanglement. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet-1K demonstrate that HICAT consistently improves over matched baselines across diverse architectures (CNNs and ViTs) while significantly reducing the robust generalization gap.
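The abstract does not give the formulas behind the pipeline, so the following is only a toy sketch of what gated logit debiasing and an orthogonality penalty could look like. Everything in it is an assumption made for illustration: the function names `debias_logits` and `orthogonality_penalty`, the scalar `gate` standing in for the estimator's diagnosis, the subtraction rule, and the squared-cosine penalty. It is not the authors' LBBE, Adaptive Debiasing, or FLOE definition.

```python
import math


def debias_logits(logits, background_logits, gate):
    """Hypothetical debias step: subtract a gated share of the background logits.

    gate = 0.0 keeps the context untouched (treated as a supportive prior);
    gate = 1.0 removes the full estimated background contribution
    (treated as a spurious confounder).
    """
    return [z - gate * b for z, b in zip(logits, background_logits)]


def orthogonality_penalty(foreground_logits, background_logits, eps=1e-12):
    """Hypothetical align step: squared cosine similarity of two logit vectors.

    The value is 0 when the vectors are orthogonal and approaches 1 when
    they point in the same (or opposite) direction.
    """
    dot = sum(f * b for f, b in zip(foreground_logits, background_logits))
    norm_f = math.sqrt(sum(f * f for f in foreground_logits))
    norm_b = math.sqrt(sum(b * b for b in background_logits))
    cos = dot / (norm_f * norm_b + eps)
    return cos * cos


# Toy numbers: the background alone already votes strongly for class 0.
logits = [4.0, 1.0, 0.5]
background = [2.0, 0.0, 0.0]
debiased = debias_logits(logits, background, gate=0.5)
penalty = orthogonality_penalty(debiased, background)
```

With a gate of 0 the logits pass through unchanged, which is one simple way to avoid suppressing context that is actually useful. In a real training loop the gate would be predicted per sample and the penalty added to the adversarial training loss with some weight; both choices are left open here because the abstract does not specify them.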