BIPPO: Budget-Aware Independent PPO for Energy-Efficient Federated Learning Services

📅 2025-11-11
🤖 AI Summary
Resource-constrained federated learning (FL) in IoT suffers from high energy consumption, poor generalization, and client selection strategies that inadequately account for budget constraints and device dynamism. Method: This paper proposes a budget-aware multi-agent reinforcement learning (MARL) framework built on an improved Independent Proximal Policy Optimization (IPPO) algorithm. It jointly models dynamic resource constraints and incorporates an efficient sampling strategy tailored to non-IID data, enabling energy-efficient and stable client selection. Contribution/Results: The framework introduces an explicit budget-aware mechanism and a lightweight independent-agent architecture that eliminates centralized training overhead while supporting the dynamic join/leave of heterogeneous devices. Experiments under stringent energy budgets show significant improvements in mean model accuracy, a negligible share of the energy budget consumed by the selection mechanism itself, and near-constant computational and communication overhead regardless of client scale, yielding high scalability and environmental adaptability.

📝 Abstract
Federated Learning (FL) is a promising machine learning solution in large-scale IoT systems, guaranteeing load distribution and privacy. However, FL does not natively consider infrastructure efficiency, a critical concern for systems operating in resource-constrained environments. Several Reinforcement Learning (RL) based solutions offer improved client selection for FL; however, they do not consider infrastructure challenges, such as resource limitations and device churn. Furthermore, the training of RL methods is often not designed for practical application, as these approaches frequently do not consider generalizability and are not optimized for energy efficiency. To fill this gap, we propose BIPPO (Budget-aware Independent Proximal Policy Optimization), which is an energy-efficient multi-agent RL solution that improves performance. We evaluate BIPPO on two image classification tasks run in a highly budget-constrained setting, with FL clients training on non-IID data, a challenging context for vanilla FL. The improved sampler of BIPPO enables it to increase the mean accuracy compared to non-RL mechanisms, traditional PPO, and IPPO. In addition, BIPPO only consumes a negligible proportion of the budget, which stays consistent even if the number of clients increases. Overall, BIPPO delivers a performant, stable, scalable, and sustainable solution for client selection in IoT-FL.
Problem

Research questions and friction points this paper is trying to address.

FL lacks infrastructure efficiency in resource-limited IoT systems
Existing RL methods ignore generalizability and energy efficiency
Client selection struggles with budget constraints and device churn
Innovation

Methods, ideas, or system contributions that make the work stand out.

Budget-aware Independent PPO for energy efficiency
Multi-agent RL solution for client selection
Improved sampler for accuracy in constrained settings
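The selection mechanism described above can be illustrated with a toy simulation. This is not the paper's implementation: it is a minimal sketch assuming a shared budget tracker that masks unaffordable clients, and it substitutes a REINFORCE-style update for the full PPO clipped-ratio objective (all class and function names here are hypothetical).

```python
import math
import random


class BudgetTracker:
    """Tracks a global energy budget; a client can only be selected
    when its estimated round cost fits in the remaining budget."""

    def __init__(self, total_budget):
        self.remaining = total_budget

    def can_afford(self, cost):
        return self.remaining >= cost

    def charge(self, cost):
        self.remaining -= cost


class IndependentAgent:
    """Toy stand-in for one independent per-client agent: a single
    participation logit nudged by a policy-gradient-style update
    (real IPPO would use the clipped surrogate objective instead)."""

    def __init__(self, lr=0.1):
        self.theta = 0.0  # logit of the participation probability
        self.lr = lr

    def prob(self):
        return 1.0 / (1.0 + math.exp(-self.theta))

    def act(self, rng):
        return rng.random() < self.prob()

    def update(self, participated, reward):
        # d/d_theta log pi(a): (1 - p) if a = participate, else -p
        p = self.prob()
        grad = (1.0 - p) if participated else -p
        self.theta += self.lr * reward * grad


def run_round(agents, costs, tracker, rng, reward_fn):
    """One FL round: each agent independently decides to participate,
    masked by the remaining budget; selected clients are charged, and
    every agent is updated with the shared round reward."""
    selected = []
    for i, agent in enumerate(agents):
        if agent.act(rng) and tracker.can_afford(costs[i]):
            tracker.charge(costs[i])
            selected.append(i)
    reward = reward_fn(selected)
    for i, agent in enumerate(agents):
        agent.update(i in selected, reward)
    return selected
```

In this sketch the budget mask, not the policy, is what guarantees the energy constraint is never violated; the learned per-client logits only shape which affordable clients tend to be picked, mirroring the decentralized, budget-aware selection the bullets describe.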
Anna Lackinger — Distributed Systems Group, TU Wien, Vienna, Austria
Andrea Morichetta — University Assistant (Postdoc), TU Wien (Distributed Systems Group); research interests: Deep Learning, Computing Continuum, Unsupervised Learning, Interpretability, Security
P. Frangoudis — Distributed Systems Group, TU Wien, Vienna, Austria
S. Dustdar — Distributed Systems Group, TU Wien, Vienna, Austria