🤖 AI Summary
To address the failure of full-model fine-tuning in federated learning (FL) under non-IID data and class imbalance, this paper proposes Probabilistic Prompt Tuning (PPT), a framework that freezes the pre-trained global model and has clients collaboratively optimize lightweight, probabilistic input prefixes (prompts). In place of conventional weight averaging, PPT aggregates the clients' diverse prompt sets with a novel probabilistic prompt aggregation mechanism, recasting federated training as cooperative optimization over a shared prompt ensemble; the authors present this as the first systematic integration of prompt tuning with FL. Experiments on multiple computer vision benchmarks show that PPT consistently outperforms baselines, including FedAvg and FedProx, achieving an average accuracy gain of 8.3% under extreme data skew while reducing communication overhead by 92%. The method also exhibits strong generalization, robustness to heterogeneity, and computational efficiency.
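The core idea above (freeze the pre-trained model, train only a small set of prompt tokens) can be illustrated with a toy sketch. The model here is a single frozen linear head over mean-pooled token embeddings, and the loss is a squared error; these are illustrative assumptions, not the paper's actual architecture. Only the prompt tokens receive gradient updates.

```python
import numpy as np

# Toy sketch of prompt tuning: the "pre-trained" head W is frozen,
# and only the K prompt tokens prepended to the input are trained.
rng = np.random.default_rng(0)

D, K = 8, 4                       # embedding dim, number of prompt tokens
W = np.ones((1, D))               # frozen pre-trained linear head (toy stand-in)
x = rng.normal(size=D)            # one input embedding
y = 1.0                           # regression target

prompt = np.zeros((K, D))         # learnable prompt tokens (only trainable part)

def forward(prompt):
    # Prepend the prompt tokens to the input, mean-pool, apply frozen head.
    tokens = np.vstack([prompt, x[None, :]])
    pooled = tokens.mean(axis=0)
    return float(W @ pooled)

def loss(prompt):
    return (forward(prompt) - y) ** 2

lr = 0.05
for _ in range(200):
    # For this quadratic loss, the gradient w.r.t. every prompt token is
    # 2 * (pred - y) * W / (K + 1), since pooling averages over K+1 tokens.
    err = forward(prompt) - y
    grad = 2 * err * np.tile(W / (K + 1), (K, 1))
    prompt -= lr * grad           # W stays frozen; only the prompt moves
```

Because the model stays frozen, a client only needs to communicate the `K × D` prompt parameters, which is the source of the communication savings the summary reports.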
📝 Abstract
Fine-tuning pre-trained models is a popular approach in machine learning for solving complex tasks with only moderate amounts of data. However, fine-tuning the entire pre-trained model is ineffective in federated settings where local data distributions are diversely skewed. To address this, we explore integrating federated learning with the more parameter-efficient prompt-tuning method, optimizing a small set of input prefixes that reprogram the pre-trained model's behavior. Our approach transforms federated learning into a distributed set-modeling task, aggregating diverse sets of prompts to globally fine-tune the pre-trained model. We benchmark baselines built from direct adaptations of existing federated model aggregation techniques and introduce a new probabilistic prompt aggregation method that substantially outperforms them. Results on a variety of computer vision datasets confirm that the proposed method is the most effective at combating extreme data heterogeneity in federated learning.
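The abstract does not spell out the probabilistic aggregation mechanism, but one plausible form of it can be sketched: each client reports a prompt estimate together with a per-dimension uncertainty, and the server fuses them as a product of Gaussians, i.e. a precision-weighted average that down-weights clients whose prompts are noisy. This is a hedged illustration of "probabilistic aggregation" in general, not the paper's exact method; the function name `aggregate_prompts` is hypothetical.

```python
import numpy as np

def aggregate_prompts(means, variances):
    """Precision-weighted fusion of per-client prompt estimates.

    means, variances: arrays of shape (num_clients, prompt_dim), where each
    row is one client's prompt and its per-dimension uncertainty.
    Returns the fused prompt mean and variance, each of shape (prompt_dim,).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    precision = 1.0 / variances                  # confidence per client/dim
    fused_var = 1.0 / precision.sum(axis=0)      # combined uncertainty
    fused_mean = fused_var * (precision * means).sum(axis=0)
    return fused_mean, fused_var
```

Unlike plain averaging (FedAvg-style), this fusion pulls the global prompt toward confident clients: a client reporting variance 0.1 contributes nine times the weight of one reporting 0.9, which is one way an aggregation rule can stay robust under skewed local distributions.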