NEO: No-Optimization Test-Time Adaptation through Latent Re-Centering

📅 2025-10-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing test-time adaptation (TTA) methods suffer from high computational overhead, heavy reliance on large volumes of target-domain data, or sensitivity to hyperparameters. This paper proposes NEO, a gradient-free, hyperparameter-free TTA method that adds no significant compute or memory cost over vanilla inference. NEO leverages the geometric structure of Vision Transformer (ViT) latent spaces: re-centering the feature embeddings of target samples at the origin aligns source and distribution-shifted data, enabling plug-and-play adaptation from as little as one batch of 64 samples. On ImageNet-C, NEO raises ViT-Base accuracy from 55.6% to 59.2%; on edge devices it reduces inference time by 63% and memory usage by 9% relative to baseline TTA methods. NEO outperforms state-of-the-art TTA approaches across multiple benchmarks, including ImageNet-R, and can even adapt on a single class to improve accuracy on the 999 other ImageNet-C classes.

📝 Abstract
Test-Time Adaptation (TTA) methods are often computationally expensive, require a large amount of data for effective adaptation, or are brittle to hyperparameters. Based on a theoretical foundation of the geometry of the latent space, we are able to significantly improve the alignment between source and distribution-shifted samples by re-centering target data embeddings at the origin. This insight motivates NEO -- a hyperparameter-free fully TTA method that adds no significant compute compared to vanilla inference. NEO is able to improve the classification accuracy of ViT-Base on ImageNet-C from 55.6% to 59.2% after adapting on just one batch of 64 samples. When adapting on 512 samples, NEO beats all 7 TTA methods we compare against on ImageNet-C, ImageNet-R and ImageNet-S, and beats 6/7 on CIFAR-10-C, while using the least amount of compute. NEO performs well on model calibration metrics and additionally is able to adapt from 1 class to improve accuracy on 999 other classes in ImageNet-C. On Raspberry Pi and Jetson Orin Nano devices, NEO reduces inference time by 63% and memory usage by 9% compared to baselines. Our results based on 3 ViT architectures and 4 datasets show that NEO can be used efficiently and effectively for TTA.
Problem

Research questions and friction points this paper is trying to address.

Improves test-time adaptation without computational overhead
Enhances model accuracy under distribution shifts efficiently
Reduces inference time and memory usage on devices
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent re-centering aligns source and target distributions
Hyperparameter-free adaptation with minimal computational overhead
Achieves accuracy gains using single batch adaptation
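The core re-centering operation is simple enough to sketch directly. The snippet below is an illustrative NumPy version of the idea (subtracting a target batch's mean embedding so the batch is centered at the origin), not the authors' implementation; all names and shapes here are our assumptions.

```python
import numpy as np

def recenter(embeddings: np.ndarray) -> np.ndarray:
    """Shift a batch of latent embeddings so their mean lies at the origin.

    Illustrative sketch of latent re-centering; `embeddings` is assumed to
    be a (batch, dim) array of ViT features for target-domain samples.
    """
    center = embeddings.mean(axis=0, keepdims=True)  # per-dimension batch mean
    return embeddings - center

# Toy example: 64 target samples with 768-dim ViT-Base-sized features,
# drawn from a shifted distribution (mean far from the origin).
rng = np.random.default_rng(0)
feats = rng.normal(loc=3.0, scale=1.0, size=(64, 768))
recentered = recenter(feats)
print(np.allclose(recentered.mean(axis=0), 0.0))  # True: batch mean is now at the origin
```

Because the adjustment is a single mean subtraction per batch, it needs no gradients, no extra forward passes, and no hyperparameters, which is consistent with the paper's "no significant compute over vanilla inference" claim.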
Alexander Murphy
University of Birmingham
Michal Danilowski
University of Birmingham
Soumyajit Chatterjee
Senior Research Scientist, Bell Labs and Visiting Scholar, University of Cambridge
Pervasive Computing · Applied Machine Learning
Abhirup Ghosh
University of Cambridge