Optimal Transceiver Design in Over-the-Air Federated Distillation

📅 2025-07-21
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
To address the high communication overhead of federated learning (FL) for large-scale AI models, this paper proposes AirFLD, an over-the-air federated distillation framework. Leveraging the analog superposition property of wireless channels, AirFLD directly aggregates distilled knowledge (i.e., soft labels) from edge devices instead of transmitting model parameters, eliminating explicit parameter exchange. It is the first work to tightly integrate federated distillation with over-the-air computation (AirComp), deriving a closed-form convergence-rate expression. The authors prove that semidefinite relaxation (SDR) for receive beamforming incurs no optimality loss, and they jointly optimize transmit power allocation and receive beamforming under per-device power constraints. Experiments demonstrate that AirFLD significantly reduces communication load with only marginal test-accuracy degradation, outperforming conventional FL baselines in both efficiency and accuracy.

πŸ“ Abstract
The rapid proliferation and growth of artificial intelligence (AI) has led to the development of federated learning (FL). FL allows wireless devices (WDs) to cooperatively learn by sharing only local model parameters, without needing to share the entire dataset. However, the emergence of large AI models has made existing FL approaches inefficient, due to the significant communication overhead required. In this paper, we propose a novel over-the-air federated distillation (FD) framework by synergizing the strength of FL and knowledge distillation to avoid the heavy local model transmission. Instead of sharing the model parameters, only the WDs' model outputs, referred to as knowledge, are shared and aggregated over-the-air by exploiting the superposition property of the multiple-access channel. We shall study the transceiver design in over-the-air FD, aiming to maximize the learning convergence rate while meeting the power constraints of the transceivers. The main challenge lies in the intractability of the learning performance analysis, as well as the non-convex nature and the optimization spanning the whole FD training period. To tackle this problem, we first derive an analytical expression of the convergence rate in over-the-air FD. Then, the closed-form optimal solutions of the WDs' transmit power and the estimator for over-the-air aggregation are obtained given the receiver combining strategy. Accordingly, we put forth an efficient approach to find the optimal receiver beamforming vector via semidefinite relaxation. We further prove that there is no optimality gap between the original and relaxed problem for the receiver beamforming design. Numerical results will show that the proposed over-the-air FD approach achieves a significant reduction in communication overhead, with only a minor compromise in testing accuracy compared to conventional FL benchmarks.
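The core idea of over-the-air aggregation can be illustrated with a minimal sketch, not the paper's exact scheme: each device pre-scales its soft-label "knowledge" to invert its channel, the multiple-access channel superposes the transmissions, and the receiver recovers the average. All names (`K`, `h`, `eta`, `noise_std`) and the real-valued single-antenna channel model are illustrative assumptions.

```python
# Hedged sketch of over-the-air (AirComp) aggregation of soft labels,
# assuming single-antenna devices, real channel gains, and simple
# channel-inversion power control. Not the paper's exact transceiver design.
import random

random.seed(0)

K = 4            # number of wireless devices (illustrative)
C = 3            # soft-label length (number of classes)
noise_std = 0.01  # receiver noise standard deviation
eta = 1.0        # common scaling factor fixed by the receiver

# Each device holds a soft-label vector (its distilled "knowledge").
soft_labels = [[random.random() for _ in range(C)] for _ in range(K)]
# Channel gains; complex-valued in practice, real here for simplicity.
h = [random.uniform(0.5, 1.5) for _ in range(K)]

# Transmit scaling: each device pre-inverts its channel so the
# superposed signal becomes the plain sum of the soft labels.
b = [eta / h[k] for k in range(K)]

# The multiple-access channel superposes all transmissions and adds noise,
# computing the sum "for free" in the analog domain.
received = [
    sum(h[k] * b[k] * soft_labels[k][c] for k in range(K))
    + random.gauss(0.0, noise_std)
    for c in range(C)
]

# Receiver estimate of the average soft label across devices.
estimate = [r / (eta * K) for r in received]
ideal = [sum(soft_labels[k][c] for k in range(K)) / K for c in range(C)]
```

With channel inversion, the estimate differs from the ideal average only by the scaled receiver noise; the paper's optimization of transmit powers and receive beamforming generalizes this to multi-antenna receivers under per-device power budgets.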
Problem

Research questions and friction points this paper is trying to address.

Reduces communication overhead in federated learning
Optimizes transceiver design for efficient knowledge distillation
Maximizes learning convergence under power constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Over-the-air federated distillation reduces communication overhead
Optimal transceiver design maximizes learning convergence rate
Semidefinite relaxation solves receiver beamforming optimization
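The SDR step mentioned above typically takes the following generic form (a sketch under assumed notation, not necessarily the paper's exact objective): lifting the receive beamformer $\mathbf{m}$ into a positive semidefinite matrix turns a non-convex quadratic program into a convex one.

```latex
\max_{\mathbf{m}} \; \min_k \frac{|\mathbf{m}^{\mathsf H}\mathbf{h}_k|^2}{\|\mathbf{m}\|^2}
\quad\xrightarrow{\;\mathbf{M} = \mathbf{m}\mathbf{m}^{\mathsf H}\;}\quad
\max_{\mathbf{M} \succeq \mathbf{0}} \; \min_k \frac{\operatorname{tr}(\mathbf{H}_k \mathbf{M})}{\operatorname{tr}(\mathbf{M})},
\qquad \mathbf{H}_k = \mathbf{h}_k \mathbf{h}_k^{\mathsf H},
```

where the non-convex constraint $\operatorname{rank}(\mathbf{M}) = 1$ is dropped. The paper's tightness result says the relaxation admits a rank-one optimal solution, so the beamformer can be recovered exactly with no optimality gap.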