🤖 AI Summary
To address the high computational/storage overhead and severe communication bottlenecks of large-scale models in second-order federated learning (FL), this paper proposes a distributed Newton-type optimization framework integrating sparse Hessian estimation with analog-domain over-the-air computation (OTA). It is the first work to deeply co-design sparse second-order curvature approximation and wireless-channel-adaptive analog aggregation, thereby circumventing conventional digital transmission constraints and substantially reducing edge-device resource consumption. Experiments demonstrate that the proposed method reduces communication resources and energy consumption by over 67% compared to state-of-the-art first- and second-order baseline FL approaches. Moreover, it enables training of larger-scale models and achieves significantly faster convergence. These improvements markedly enhance the feasibility and scalability of second-order FL on resource-constrained edge devices.
📝 Abstract
Second-order federated learning (FL) algorithms offer faster convergence than their first-order counterparts by leveraging curvature information. However, they are hindered by high computational and storage costs, particularly for large-scale models. Furthermore, the communication overhead associated with large models and digital transmission exacerbates these challenges, causing communication bottlenecks. In this work, we propose a scalable second-order FL algorithm using a sparse Hessian estimate and leveraging over-the-air aggregation, making it feasible for larger models. Our simulation results demonstrate more than $67%$ of communication resources and energy savings compared to other first and second-order baselines.