🤖 AI Summary
Addressing the dual challenges of Byzantine resilience and high communication overhead in heterogeneous federated learning, this paper proposes CyBeR-0, a zeroth-order optimization framework. CyBeR-0 eliminates gradient transmission entirely: each client uploads only a few scalars per round, drastically reducing both uplink/downlink communication and memory overhead. It introduces a novel transformation-based robust aggregation mechanism and provides theoretical convergence guarantees for general non-convex objectives under statistical heterogeneity and Byzantine failures. Notably, CyBeR-0 is the first zeroth-order method to enable stable, efficient fine-tuning of large language models (LLMs) in federated settings. Extensive experiments show that CyBeR-0 maintains high accuracy on standard benchmarks and LLM fine-tuning tasks while cutting communication costs by one to two orders of magnitude and substantially lowering the memory footprint, achieving strong Byzantine robustness without sacrificing practicality.
📝 Abstract
We introduce CyBeR-0, a federated zeroth-order optimization method that is resilient to Byzantine attacks and provides significant savings in uplink and downlink communication costs. We further introduce transformed robust aggregation, which yields convergence guarantees for general non-convex objectives under client data heterogeneity. Empirical evaluations on standard learning tasks and fine-tuning of large language models show that CyBeR-0 exhibits stable performance with a per-round communication cost of only a few scalars and reduced memory requirements.
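
The communication pattern described above can be pictured with a short sketch. The Python (NumPy) code below is illustrative only: the shared-seed trick for regenerating the perturbation follows standard zeroth-order practice, a trimmed mean stands in for CyBeR-0's transformed robust aggregation (which this summary does not specify), and all function names and hyperparameters are invented for the example.

```python
import numpy as np

def zo_scalar_grad(loss_fn, w, seed, eps=1e-3):
    """Two-point zeroth-order estimate of the directional derivative.

    The perturbation z is regenerated from a seed shared with the server,
    so only the resulting scalar ever needs to be uplinked.
    """
    z = np.random.default_rng(seed).standard_normal(w.shape)
    return (loss_fn(w + eps * z) - loss_fn(w - eps * z)) / (2.0 * eps)

def robust_aggregate(scalars, trim_frac=0.2):
    """Trimmed mean over per-client scalars (a generic stand-in aggregator,
    not the paper's transformed robust aggregation)."""
    s = np.sort(np.asarray(scalars, dtype=float))
    k = int(len(s) * trim_frac)
    return float(s[k:len(s) - k].mean())

def server_round(reported_scalars, w, seed, lr=0.05):
    """One round: aggregate the scalars, regenerate z, apply the update."""
    g = robust_aggregate(reported_scalars)
    z = np.random.default_rng(seed).standard_normal(w.shape)
    return w - lr * g * z

# Toy run: four honest quadratic clients with slightly heterogeneous optima,
# plus one Byzantine client that reports a wildly wrong scalar each round.
honest_losses = [lambda w, t=t: float(np.sum((w - t) ** 2))
                 for t in (0.9, 1.0, 1.1, 1.0)]
w = np.zeros(4)
for r in range(300):
    scalars = [zo_scalar_grad(loss, w, seed=r) for loss in honest_losses]
    scalars.append(1e6)  # Byzantine report; the trimmed mean discards it
    w = server_round(scalars, w, seed=r)
print(w)  # ends close to the honest consensus around 1.0 in every coordinate
```

Because every party regenerates z from the shared per-round seed, each round costs one scalar uplink per client and one scalar downlink, which is the few-scalars-per-round communication pattern the paper claims.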