🤖 AI Summary
In federated learning, vision transformer (ViT) prompt tuning faces a fundamental trade-off between global generalizability and client-level personalization. To address this, we propose PEP-FedPT—a unified framework for personalized and efficient federated prompt tuning. Our core innovations are: (1) a sample-level adaptive prompt generator leveraging class-prototype modeling and client-wise class-prior weighting; and (2) a class-contextualized hybrid prompting mechanism that dynamically fuses globally shared prompts with class-specific prompts—enabling per-sample personalization without learnable local parameters. Extensive experiments on heterogeneous benchmarks—including CIFAR-100 and TinyImageNet—demonstrate that PEP-FedPT significantly outperforms existing federated prompt tuning methods. It achieves superior generalization across clients while enhancing local adaptation, thereby effectively mitigating the challenges posed by non-IID data distributions.
📝 Abstract
Visual Prompt Tuning (VPT) of pre-trained Vision Transformers (ViTs) has proven highly effective as a parameter-efficient fine-tuning technique for adapting large models to downstream tasks with limited data. Its parameter efficiency makes it particularly suitable for Federated Learning (FL), where both communication and computation budgets are often constrained. However, global prompt tuning struggles to generalize across heterogeneous clients, while personalized tuning tends to overfit local data and generalizes poorly. We propose PEP-FedPT (Prompt Estimation from Prototypes for Federated Prompt Tuning), a unified framework designed to achieve both generalization and personalization in federated prompt tuning of ViTs. Within this framework, we introduce the novel Class-Contextualized Mixed Prompt (CCMP), which maintains class-specific prompts alongside a globally shared prompt. For each input, CCMP adaptively combines the class-specific prompts using weights derived from global class prototypes and client-specific class priors. This approach enables per-sample prompt personalization without storing client-dependent trainable parameters. All prompts are collaboratively optimized via standard federated averaging. Comprehensive evaluations on the CIFAR-100, TinyImageNet, DomainNet, and iNaturalist datasets demonstrate that PEP-FedPT consistently surpasses state-of-the-art baselines under diverse data-heterogeneity scenarios, establishing a strong foundation for efficient and generalizable federated prompt tuning of Vision Transformers.
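The mixing idea behind CCMP can be sketched as follows. The abstract does not give the exact weighting formula, so the details below are illustrative assumptions: per-sample weights are taken as a softmax over cosine similarities to the global class prototypes, biased by the client's log class priors, and the resulting mixture of class-specific prompts is concatenated with the globally shared prompt. All function and parameter names (`ccmp_prompt`, `tau`, the prompt shapes) are hypothetical.

```python
import numpy as np

def ccmp_prompt(feature, prototypes, class_prompts, global_prompt,
                class_priors, tau=1.0):
    """Illustrative sketch of a Class-Contextualized Mixed Prompt.

    feature:       (d,)       sample embedding from the frozen ViT backbone
    prototypes:    (C, d)     global class prototypes
    class_prompts: (C, L, d)  class-specific prompt tokens
    global_prompt: (L, d)     globally shared prompt tokens
    class_priors:  (C,)       client-side class prior probabilities
    """
    # Cosine similarity between the sample and each global class prototype.
    f = feature / np.linalg.norm(feature)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = p @ f                                   # (C,)

    # Assumed weighting: prototype similarity tempered by tau, biased by
    # the client's class priors, normalized with a stable softmax.
    logits = sims / tau + np.log(class_priors + 1e-12)
    w = np.exp(logits - logits.max())
    w /= w.sum()                                   # (C,), sums to 1

    # Convex combination of class-specific prompts -> per-sample prompt.
    mixed = np.tensordot(w, class_prompts, axes=1)  # (L, d)

    # Fuse with the shared prompt; here simply concatenated along tokens.
    return np.concatenate([global_prompt, mixed], axis=0)  # (2L, d)
```

Because the weights depend only on the sample, the shared prototypes, and the client's label distribution, the personalization requires no client-local trainable parameters, which is the property the paper highlights.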