Reviving Stale Updates: Data-Free Knowledge Distillation for Asynchronous Federated Learning

📅 2025-11-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In asynchronous federated learning (AFL), clients compute updates on outdated copies of the global model, and these stale updates can destabilize optimization and slow convergence. FedRevive addresses this with data-free knowledge distillation (DFKD) performed on the server. A meta-learned generator synthesizes pseudo-samples, which are used for multi-teacher distillation from stale client models into the current global model, and a hybrid aggregation scheme combines the raw parameter-space updates with the DFKD updates, so stale updates are corrected without access to real client data or public data. The server-side DFKD step is designed to be lightweight and to preserve the scalability of AFL. On several vision and text benchmarks, FedRevive trains up to 32.1% faster and reaches up to 21.5% higher final accuracy than asynchronous baselines.

📝 Abstract

Federated Learning (FL) enables collaborative model training across distributed clients without sharing raw data, yet its scalability is limited by synchronization overhead. Asynchronous Federated Learning (AFL) alleviates this issue by allowing clients to communicate independently, thereby improving wall-clock efficiency in large-scale, heterogeneous environments. However, this asynchrony introduces stale updates (client updates computed on outdated global models) that can destabilize optimization and hinder convergence. We propose FedRevive, an asynchronous FL framework that revives stale updates through data-free knowledge distillation (DFKD). FedRevive integrates parameter-space aggregation with a lightweight, server-side DFKD process that transfers knowledge from stale client models to the current global model without access to real or public data. A meta-learned generator synthesizes pseudo-samples, which enables multi-teacher distillation. A hybrid aggregation scheme that combines raw updates with DFKD updates effectively mitigates staleness while retaining the scalability of AFL. Experiments on various vision and text benchmarks show that FedRevive trains up to 32.1% faster and reaches up to 21.5% higher final accuracy than asynchronous baselines.
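
To make the multi-teacher distillation step more concrete, here is a minimal sketch of one common formulation: the student (the current global model) is trained to match the average of the teachers' softened predictions on a pseudo-sample, measured by KL divergence. The function names, the uniform teacher averaging, and the `temperature` parameter are illustrative assumptions; the abstract does not specify the loss FedRevive actually uses.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_kd_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(average teacher || student) for a single pseudo-sample.

    teacher_logits: one logit list per stale client model (the teachers).
    student_logits: logits of the current global model (the student).
    Uniform averaging over teachers is an assumption of this sketch.
    """
    teacher_probs = [softmax(t, temperature) for t in teacher_logits]
    n_teachers = len(teacher_probs)
    n_classes = len(student_logits)
    avg = [sum(p[k] for p in teacher_probs) / n_teachers for k in range(n_classes)]
    student = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(avg, student) if p > 0)
```

The loss is zero when the student already agrees with the averaged teachers and grows as their predictions diverge, which is the signal a server-side distillation step would minimize over generated pseudo-samples.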

Problem

Research questions and friction points this paper is trying to address.

Mitigating stale updates in asynchronous federated learning

Reviving outdated client models without real data

Improving convergence and efficiency in distributed training

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses data-free knowledge distillation to revive stale updates

Employs meta-learned generator for pseudo-sample synthesis

Combines parameter aggregation with distillation in hybrid scheme
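
The hybrid scheme in the last point can be illustrated with a small sketch that blends a stale raw update with a distillation-derived update in parameter space. The staleness-based weighting `1 / (1 + staleness)`, the `lr` parameter, and the function name are assumptions made for illustration only; the summary does not state how FedRevive weights the two updates.

```python
def hybrid_aggregate(global_weights, raw_delta, dfkd_delta, staleness, lr=1.0):
    """Apply a blend of a raw client update and a DFKD update to the global model.

    staleness: number of global versions the client update lags behind.
    Assumed heuristic: the raw update is trusted less as staleness grows,
    and the remaining weight goes to the distillation-derived update.
    """
    raw_weight = 1.0 / (1.0 + staleness)
    dfkd_weight = 1.0 - raw_weight
    return [
        g + lr * (raw_weight * r + dfkd_weight * d)
        for g, r, d in zip(global_weights, raw_delta, dfkd_delta)
    ]


# A fresh update (staleness 0) is applied as-is; a stale one is partly
# replaced by the distillation-derived update.
fresh = hybrid_aggregate([0.0, 1.0], [2.0, 2.0], [4.0, 0.0], staleness=0)
stale = hybrid_aggregate([0.0, 1.0], [2.0, 2.0], [4.0, 0.0], staleness=1)
```

With this weighting, a fresh update reduces to plain asynchronous aggregation, so the distillation path only matters when updates actually lag behind the global model.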
