🤖 AI Summary
To address domain-induced self-attention feature shifts during online test-time adaptation (TTA) of Transformer models, this paper proposes a layer-wise progressive conditioned scale-shift recalibration (PCSR) mechanism. The method models domain shift as a layer-wise progressive separation process and introduces two lightweight, differentiable networks—a Domain Separation Network and a Factor Generator Network—that operate online to dynamically predict layer-specific conditioned scale and shift parameters for each self-attention module. These parameters are applied via efficient local linear transformations to recalibrate the attention features. Crucially, the approach requires no access to source-domain data or labels and operates entirely online. Evaluated on benchmarks including ImageNet-C, it improves classification accuracy by up to 3.9%, outperforming existing online TTA methods. Key contributions include: (i) the first formulation of domain shift as a progressive, layer-wise separation process; (ii) a fully online, parameter-efficient recalibration framework; and (iii) state-of-the-art performance without source data dependency.
📝 Abstract
Online test-time adaptation aims to dynamically adjust a network model in real time based on sequential input samples during the inference stage. In this work, we find that, when applying a transformer network model to a new target domain, the Query, Key, and Value features of its self-attention module often change significantly from those in the source domain, leading to substantial performance degradation of the transformer model. To address this important issue, we propose a new approach that progressively recalibrates the self-attention at each layer using a local linear transform parameterized by conditioned scale and shift factors. We view online model adaptation from the source domain to the target domain as a progressive domain shift separation process. At each transformer network layer, we learn a Domain Separation Network to extract the domain shift feature, which is used by a Factor Generator Network to predict the scale and shift parameters for self-attention recalibration. These two lightweight networks are adapted online during inference. Experimental results on benchmark datasets demonstrate that the proposed progressive conditioned scale-shift recalibration (PCSR) method improves online test-time domain adaptation performance by up to 3.9% in classification accuracy on the ImageNet-C dataset.
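The core recalibration step described above—extract a per-layer domain-shift feature, map it to scale and shift factors, then apply a local linear transform to the attention features—can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: the two networks are reduced to single linear maps, the "source statistic" is a hypothetical running mean, and all names (`domain_separation`, `factor_generator`, `recalibrate`) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8          # feature dimension of one self-attention layer
n_tokens = 4   # tokens in the current test sample

def domain_separation(feats, source_stat):
    """Toy stand-in for the Domain Separation Network: the domain-shift
    feature is the gap between the test sample's mean token feature and
    a (hypothetical) running source-domain statistic."""
    return feats.mean(axis=0) - source_stat

def factor_generator(shift_feat, W_scale, W_shift):
    """Toy stand-in for the Factor Generator Network: predict conditioned
    scale/shift factors from the domain-shift feature via linear maps."""
    gamma = shift_feat @ W_scale   # deviation from identity scale
    beta = shift_feat @ W_shift
    return gamma, beta

def recalibrate(feats, gamma, beta):
    """Local linear transform per channel: x -> (1 + gamma) * x + beta."""
    return feats * (1.0 + gamma) + beta

# One layer's Query/Key/Value-style features for a test sample.
qkv = rng.standard_normal((n_tokens, d))
source_stat = np.zeros(d)

# Zero-initialised generator weights => identity recalibration at the
# start of adaptation; online updates would then move them off zero.
W_scale = np.zeros((d, d))
W_shift = np.zeros((d, d))

shift_feat = domain_separation(qkv, source_stat)
gamma, beta = factor_generator(shift_feat, W_scale, W_shift)
out = recalibrate(qkv, gamma, beta)
```

A zero initialisation of the factor generator is a natural design choice for online TTA: the transform starts as the identity, so the adapted model initially matches the source model and only departs from it as evidence of domain shift accumulates.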