🤖 AI Summary
To address the dual challenges of **client-side data scarcity** and **distribution heterogeneity** in test-time adaptation of vision-language models under federated learning, this paper proposes **Latte**, a collaborative memory-based adaptation framework. Latte introduces a dual-memory mechanism: each client maintains a private local memory of embeddings from its own historical test data, plus an external memory of class prototypes retrieved from similar clients under the server's coordination—balancing personalization and privacy with collaborative gains. For local adaptation, Latte combines embedding similarity with prediction uncertainty to improve performance. Theoretically, Latte is shown to benefit from in-distribution clients while remaining robust to out-of-distribution ones. Extensive experiments on domain adaptation and data corruption benchmarks demonstrate consistent superiority over state-of-the-art methods in decentralized settings, while introducing only negligible communication and computation overhead.
📝 Abstract
Test-time adaptation with pre-trained vision-language models has gained increasing attention for addressing distribution shifts during testing. Among these approaches, memory-based algorithms stand out due to their training-free nature and ability to leverage historical test data. However, existing test-time adaptation methods are typically designed for a single domain with abundant data. In decentralized settings such as federated learning, applying these methods independently at each client suffers from limited test data, while directly sharing a single global memory via the server prevents proper personalization to each client's unique distribution. To address this, we propose Latte, a novel framework in which each client maintains a local memory to store embeddings from its own historical test data and an external memory to store class prototypes from other relevant clients. During communication, each client retrieves prototypes from similar clients under the server's coordination to expand its memory. For local adaptation, Latte utilizes both embedding similarity and uncertainty to enhance model performance. Our theoretical analysis shows that Latte effectively leverages in-distribution clients while remaining robust to out-of-distribution clients. Extensive experiments on domain adaptation and corruption benchmarks validate that Latte achieves superior performance in decentralized settings, while introducing only negligible communication and computation costs. Our code is available at https://github.com/baowenxuan/Latte.
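To make the mechanism concrete, the following is a minimal, hypothetical sketch of the dual-memory idea described above: each client accumulates normalized test embeddings in a local memory, the server matches a client with its most similar peers by prototype cosine similarity and copies their class prototypes into the client's external memory, and prediction scores classes by embedding similarity with an entropy term as an uncertainty signal. All class, function, and weighting choices here (e.g. `server_match`, entropy as the uncertainty measure, top-k matching) are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class Client:
    """Hypothetical client holding a local memory (its own test embeddings,
    grouped by pseudo-label) and an external memory (class prototypes
    retrieved from similar clients)."""

    def __init__(self, num_classes):
        self.local = {c: [] for c in range(num_classes)}
        self.external = {}  # class -> prototype shared by a peer client
        self.num_classes = num_classes

    def add_test_embedding(self, z, pseudo_label):
        # Store L2-normalized embeddings so dot products are cosine similarities.
        self.local[pseudo_label].append(z / np.linalg.norm(z))

    def class_prototypes(self):
        # Mean embedding per class, over classes seen so far.
        return {c: np.mean(v, axis=0) for c, v in self.local.items() if v}

    def predict(self, z):
        """Memory-based prediction: score each class by the best cosine
        similarity across local and external prototypes; return class
        probabilities plus an entropy-based uncertainty (illustrative)."""
        z = z / np.linalg.norm(z)
        scores = np.full(self.num_classes, -1e9)  # classes absent from memory
        protos = self.class_prototypes()
        for c in range(self.num_classes):
            sims = []
            if c in protos:
                sims.append(float(z @ protos[c]))
            if c in self.external:
                sims.append(float(z @ self.external[c]))
            if sims:
                scores[c] = max(sims)
        probs = softmax(scores)
        entropy = float(-(probs * np.log(probs + 1e-12)).sum())
        return probs, entropy

def server_match(clients, query_idx, top_k=1):
    """Server-side coordination (sketch): rank peers by mean cosine
    similarity of shared class prototypes to the querying client's, then
    copy the top-k peers' prototypes into its external memory."""
    q = clients[query_idx].class_prototypes()
    ranked = []
    for j, other in enumerate(clients):
        if j == query_idx:
            continue
        p = other.class_prototypes()
        shared = set(q) & set(p)
        if shared:
            ranked.append((np.mean([q[c] @ p[c] for c in shared]), j))
    for _, j in sorted(ranked, reverse=True)[:top_k]:
        clients[query_idx].external.update(clients[j].class_prototypes())
```

In this sketch, only class prototypes (mean embeddings) cross the client boundary, never raw test data, which is how the abstract's privacy-versus-collaboration trade-off is balanced; the per-client external memory is what distinguishes this from a single shared global memory.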