AI Summary
To address personalized federated learning (PFL) under client-side label scarcity, this paper proposes FLowDUP, the first PFL framework designed for unlabeled clients. Methodologically, it generates a personalized model in a single forward pass, with the generated parameters residing in a low-dimensional subspace; introduces a transductive multi-task learning mechanism that lets both labeled and unlabeled clients take part in collaborative training; and establishes, for the first time, a transductive multi-task PAC-Bayesian generalization bound that provides theoretical guarantees for unlabeled settings. A bidirectional contribution mechanism is also designed to make interaction between global and local models more efficient. Experiments demonstrate that FLowDUP significantly outperforms state-of-the-art baselines across diverse statistically heterogeneous datasets. Ablation studies confirm the effectiveness of each component, and communication and computational overhead are substantially reduced.
Abstract
Personalized federated learning has emerged as a popular approach to training on devices, known as clients, that hold statistically heterogeneous data. However, most existing approaches require a client to have labeled data for training or finetuning in order to obtain its own personalized model. In this paper, we address this limitation by proposing FLowDUP, a novel method that generates a personalized model using only a forward pass with unlabeled data. The generated model parameters reside in a low-dimensional subspace, enabling efficient communication and computation. FLowDUP's learning objective is theoretically motivated by our new transductive multi-task PAC-Bayesian generalization bound, which provides performance guarantees for unlabeled clients. The objective is structured so that both clients with labeled data and clients with only unlabeled data can contribute to the training process. To supplement our theoretical results, we carry out a thorough experimental evaluation of FLowDUP, demonstrating strong empirical performance on a range of datasets with different kinds of statistically heterogeneous clients. Through numerous ablation studies, we test the efficacy of the individual components of the method.
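As a rough illustration of what "generating a personalized model with a single forward pass on unlabeled data" can look like, the NumPy sketch below maps a summary of a client's unlabeled features to a few subspace coordinates and then to classifier weights. It is not FLowDUP's actual architecture or objective: the mean-pooled summary, the `tanh` map, the linear classifier, and every name (`theta_0`, `basis`, `W_embed`, `personalize`) are assumptions made for illustration, and the shared quantities are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: feature dimension, number of classes, subspace dimension.
D_IN, N_CLASSES, K = 8, 3, 4
N_PARAMS = D_IN * N_CLASSES  # parameter count of a linear classifier

# Shared quantities that would be learned during federated training.
# They are random here, since this sketch only shows the forward pass.
theta_0 = rng.normal(size=N_PARAMS)        # shared base parameters
basis = rng.normal(size=(N_PARAMS, K))     # spans the low-dimensional subspace
W_embed = rng.normal(size=(D_IN, K))       # assumed map: data summary -> coordinates


def personalize(x_unlabeled):
    """Produce client-specific weights from unlabeled features, without labels or gradients."""
    summary = x_unlabeled.mean(axis=0)     # order-independent summary of the client's data
    z = np.tanh(summary @ W_embed)         # K subspace coordinates for this client
    theta = theta_0 + basis @ z            # parameters lie in an affine K-dim subspace
    return z, theta.reshape(D_IN, N_CLASSES)


def predict(x, weights):
    """Class predictions of the personalized linear classifier."""
    return (x @ weights).argmax(axis=1)


# A client with 50 unlabeled samples drawn from a shifted distribution.
x_client = rng.normal(loc=1.0, size=(50, D_IN))
z, weights = personalize(x_client)
preds = predict(x_client, weights)
```

In a sketch like this, a client is described by `K` coordinates rather than `N_PARAMS` parameters, which is the kind of saving a low-dimensional parameterization makes possible.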