Multi-Modal Style Transfer-based Prompt Tuning for Efficient Federated Domain Generalization

📅 2026-01-09
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes FaST-PT, a novel framework for federated domain generalization that addresses domain shift caused by cross-client data heterogeneity and the associated high communication and computational overhead. The approach introduces lightweight multimodal style transfer guided by textual supervision to enhance local feature representations and incorporates a dual-prompt architecture that disentangles global and domain-specific knowledge. Coupled with a sample-adaptive, domain-aware prompt generation mechanism, FaST-PT effectively reduces communication costs while improving generalization to unseen domains. Extensive experiments on four benchmarks—including PACS and DomainNet—demonstrate significant performance gains over state-of-the-art methods such as FedDG-GA and DiPrompt. Ablation studies further confirm the framework’s effectiveness and efficiency.
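The listing does not specify how the lightweight multimodal style transfer works internally. As a rough, purely illustrative sketch, a common style-transfer primitive it might resemble is an AdaIN-style statistic swap: normalize image features per channel, then re-scale and shift them with target statistics derived from the text side. The mapping from a text embedding to per-channel (mean, log-std) used below is a hypothetical stand-in, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(1)

def adain_style_transfer(img_feat, txt_emb, eps=1e-5):
    """AdaIN-style statistic swap (illustrative stand-in for MST).

    img_feat: (C, N) image features (C channels, N spatial positions).
    txt_emb:  (2*C,) packed per-channel target (mean, log-std),
              here assumed to come from a text encoder (hypothetical).
    """
    C = img_feat.shape[0]
    mu_t, log_sigma_t = txt_emb[:C], txt_emb[C:]
    # whiten the image features per channel...
    mu_i = img_feat.mean(axis=1, keepdims=True)
    sigma_i = img_feat.std(axis=1, keepdims=True) + eps
    normed = (img_feat - mu_i) / sigma_i
    # ...then re-color them with the text-derived statistics
    return np.exp(log_sigma_t)[:, None] * normed + mu_t[:, None]

feat = rng.normal(loc=3.0, scale=2.0, size=(16, 64))
txt = np.zeros(32)  # target style: zero mean, unit std per channel
styled = adain_style_transfer(feat, txt)
```

Augmenting local embeddings this way (rather than raw pixels) is what would keep the operation lightweight on each client.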

📝 Abstract
Federated Domain Generalization (FDG) aims to collaboratively train a global model across distributed clients that generalizes well to unseen domains. However, existing FDG methods typically struggle with cross-client data heterogeneity and incur significant communication and computation overhead. To address these challenges, this paper presents a new FDG framework, dubbed FaST-PT, which facilitates local feature augmentation and efficient unseen domain adaptation in a distributed manner. First, we propose a lightweight Multi-Modal Style Transfer (MST) method that transforms image embeddings under text supervision, expanding the training data distribution and mitigating domain shift. We then design a dual-prompt module that decomposes the prompt into global and domain prompts. Specifically, global prompts capture general knowledge from augmented embeddings across clients, while domain prompts capture domain-specific knowledge from local data. In addition, Domain-aware Prompt Generation (DPG) is introduced to adaptively generate suitable prompts for each sample, facilitating unseen domain adaptation through knowledge fusion. Extensive experiments on four cross-domain benchmark datasets, including PACS and DomainNet, demonstrate the superior performance of FaST-PT over SOTA FDG methods such as FedDG-GA and DiPrompt. Ablation studies further validate the effectiveness and efficiency of FaST-PT.
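The abstract describes the dual-prompt decomposition and sample-adaptive generation only at a high level. A minimal sketch of one plausible realization, assuming DPG weights per-domain prompts by the similarity between an image embedding and learnable domain keys (all names and shapes below are hypothetical, not the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

class DualPromptSketch:
    """Illustrative dual-prompt store: one shared global prompt plus
    one prompt per source domain, fused per sample."""
    def __init__(self, num_domains, prompt_len, dim):
        self.global_prompt = rng.normal(size=(prompt_len, dim))
        self.domain_prompts = rng.normal(size=(num_domains, prompt_len, dim))
        # one key vector per domain, matched against the image embedding
        self.domain_keys = rng.normal(size=(num_domains, dim))

    def generate(self, image_emb):
        """Sample-adaptive generation: weight domain prompts by key
        similarity, then prepend the shared global prompt."""
        sims = self.domain_keys @ image_emb        # (num_domains,)
        w = softmax(sims)                          # adaptive mixture weights
        mixed = np.einsum("d,dlk->lk", w, self.domain_prompts)
        return np.concatenate([self.global_prompt, mixed], axis=0)

module = DualPromptSketch(num_domains=3, prompt_len=4, dim=8)
prompt = module.generate(rng.normal(size=8))
```

Under this reading, only the global prompt would need to be synchronized across clients, which is consistent with the communication savings the abstract claims; the weighted fusion lets an unseen-domain sample borrow from whichever source domains it most resembles.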
Problem

Research questions and friction points this paper is trying to address.

Federated Domain Generalization
data heterogeneity
unseen domain generalization
communication overhead
computation overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Modal Style Transfer
Prompt Tuning
Federated Domain Generalization
Domain-aware Prompt Generation
Dual-prompt Module
👥 Authors
Yuliang Chen
University of California, San Diego (Self-Supervised Learning, Multimodal Learning)
Xi Lin
School of Computer Science, Shanghai Jiao Tong University
Jun Wu
School of Computer Science, Shanghai Jiao Tong University
Xiangrui Cai
Nankai University (Healthcare AI, Time Series Analysis, AI Safety)
Qiaolun Zhang
Department of Electronics, Information and Bioengineering, Polytechnic Institute of Milan
Xichun Fan
New York University Shanghai
Jiapeng Xu
School of Computer Science, Shanghai Jiao Tong University
Xiu Su
Big Data Institute, Central South University