🤖 AI Summary
This work addresses natural distribution shifts that evolve gradually over time, a setting common in deployed models, and proposes STAD, a label-free online test-time adaptation (TTA) method. Unlike prevailing TTA approaches designed for synthetic corruptions, STAD integrates a probabilistic state-space model into the TTA framework: through latent-variable inference, it explicitly models the time-varying dynamics of the final hidden-feature representations, inferring time-evolving class prototypes that serve as a dynamic classification head, updated without any labels. Evaluated on realistic temporal distribution-shift benchmarks, the approach outperforms existing TTA methods, with the largest gains under small-batch inference and label shift, demonstrating robustness in non-stationary, open-world deployment.
📝 Abstract
Distribution shifts between training and test data are inevitable over the lifecycle of a deployed model, leading to performance decay. Adapting a model on test samples can help mitigate this drop in performance. However, most test-time adaptation methods have focused on synthetic corruption shifts, leaving a variety of distribution shifts underexplored. In this paper, we focus on distribution shifts that evolve gradually over time, which are common in the wild but challenging for existing methods, as we show. To address this, we propose STAD, a probabilistic state-space model that adapts a deployed model to temporal distribution shifts by learning the time-varying dynamics in the last set of hidden features. Without requiring labels, our model infers time-evolving class prototypes that act as a dynamic classification head. Through experiments on real-world temporal distribution shifts, we show that our method excels in handling small batch sizes and label shift.
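To make the core idea concrete, here is a toy sketch (not the paper's implementation) of the mechanism the abstract describes: each class prototype is treated as the latent state of a random-walk state-space model, tracked with a Kalman-filter-style update from unlabeled test features, and the current prototype estimates act as a nearest-prototype classification head. All dimensions, noise values, and the drift simulation below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

D, C = 2, 2                           # feature dim, number of classes (toy values)
protos = np.array([[-2.0, 0.0],       # current prototype mean per class
                   [ 2.0, 0.0]])
var = np.ones(C)                      # scalar posterior variance per prototype
Q, R = 0.05, 0.5                      # process noise (drift) and observation noise

def predict(x):
    """Classify by nearest prototype: the prototypes ARE the classifier head."""
    return int(np.argmin(np.linalg.norm(protos - x, axis=1)))

def adapt(x):
    """Label-free update: assign x to its nearest prototype, then run one
    Kalman filter step so that prototype tracks the drifting feature."""
    c = predict(x)
    p_var = var[c] + Q                # predict step: random-walk dynamics
    k = p_var / (p_var + R)           # Kalman gain
    protos[c] += k * (x - protos[c])  # correct prototype toward observation
    var[c] = (1.0 - k) * p_var
    return c

# Simulate a gradual temporal shift: both class clusters drift rightward.
true_means = np.array([[-2.0, 0.0], [2.0, 0.0]])
correct = 0
for t in range(200):
    drift = np.array([0.02 * t, 0.0])
    y = t % 2                         # alternate classes (unknown to the model)
    x = true_means[y] + drift + rng.normal(scale=0.3, size=D)
    correct += (adapt(x) == y)

print(f"accuracy under drift: {correct / 200:.2f}")
```

Because the filter continually re-estimates each prototype, the classifier follows the drift instead of decaying with it; a frozen head would eventually misclassify everything once the clusters move past the old decision boundary. The actual method infers richer time-varying dynamics, but the predict-then-correct structure is the same.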