🤖 AI Summary
Vision-language models (e.g., CLIP) suffer from degraded generalization under distribution shifts. Existing training-free test-time adaptation (TTA) methods operate solely in the original feature space and rely only on high-confidence samples. This work proposes a novel training-free TTA framework that escapes CLIP's native feature space: it refines features of *all* test samples—including low-confidence ones—via a single-step k-nearest-neighbor mean-shift operation, and introduces a caching mechanism to reuse refined embeddings, enhancing both inference efficiency and inter-class separability. Its core innovation lies in mean-shift-guided, cross-sample feature collaboration, jointly optimizing robustness and compactness. Evaluated across multiple out-of-distribution and cross-dataset benchmarks, the method consistently outperforms state-of-the-art training-free TTA approaches, delivering stable performance gains with minimal computational overhead.
📝 Abstract
Vision-language models (VLMs) like CLIP exhibit strong generalization but struggle with distribution shifts at test time. Existing training-free test-time adaptation (TTA) methods operate strictly within CLIP's original feature space, relying on high-confidence samples while overlooking the potential of low-confidence ones. We propose MS-TTA, a training-free approach that enhances feature representations beyond CLIP's space using a single-step k-nearest-neighbors (kNN) Mean-Shift. By refining all test samples, MS-TTA improves feature compactness and class separability, leading to more stable adaptation. Additionally, a cache of refined embeddings further enhances inference by providing Mean-Shift-enhanced logits. Extensive evaluations on OOD and cross-dataset benchmarks demonstrate that MS-TTA consistently outperforms state-of-the-art training-free TTA methods, achieving robust adaptation without requiring additional training.
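The core operation described above—a single-step kNN Mean-Shift that pulls each test feature toward the mean of its nearest neighbors—can be sketched as follows. This is a minimal illustration of the general technique, not the paper's implementation: the function name, the choice of `k`, and the use of cosine similarity over L2-normalized embeddings are all assumptions.

```python
import numpy as np

def mean_shift_step(feats: np.ndarray, k: int = 5) -> np.ndarray:
    """One kNN mean-shift step (illustrative sketch, not the paper's code):
    replace each feature with the mean of its k nearest neighbors under
    cosine similarity, then re-normalize to the unit sphere."""
    # L2-normalize so dot products equal cosine similarities
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = f @ f.T                          # pairwise cosine similarity
    # indices of the k most similar samples per row (self included)
    nn = np.argsort(-sim, axis=1)[:, :k]
    shifted = f[nn].mean(axis=1)           # mean of each sample's neighbors
    return shifted / np.linalg.norm(shifted, axis=1, keepdims=True)

# toy example: two loose clusters of synthetic "features" tighten
# after a single shift, illustrating improved compactness
rng = np.random.default_rng(0)
a = rng.normal([1.0, 0.0, 0.0], 0.2, size=(8, 3))
b = rng.normal([0.0, 1.0, 0.0], 0.2, size=(8, 3))
x = np.vstack([a, b])
refined = mean_shift_step(x, k=4)
```

Because every sample is shifted—not just high-confidence ones—low-confidence features near a cluster boundary are drawn toward their neighborhood's mode, which is what yields the improved class separability the abstract refers to.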