🤖 AI Summary
Federated learning (FL) preserves data privacy and reduces communication overhead by training models on edge devices. However, large-scale FL deployments incur substantial carbon emissions due to high energy consumption and pronounced spatiotemporal variation in grid carbon intensity, which differs by up to 60× across regions. Existing FL optimization approaches target time-to-accuracy trade-offs or energy efficiency, neglect this carbon-intensity variability, and thus fail to achieve carbon-efficient training.
Method: We propose the first end-to-end carbon-aware FL framework, integrating carbon-aware client selection, dynamic over-provisioning, and carbon-sensitive straggler mitigation. Our approach jointly leverages geolocated carbon intensity data, client-specific data utility estimation, and dynamic resource orchestration, implemented as a scheduling policy within the Flower framework.
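The core idea of carbon-aware client selection can be sketched as ranking clients by a joint score of data utility and local grid carbon intensity. The sketch below is illustrative only, assuming a simple utility-per-carbon ratio; the names (`Client`, `select_clients`) and the scoring rule are assumptions, not EcoLearn's actual API or policy.

```python
# Hypothetical sketch of carbon-aware client selection: rank clients by an
# assumed score of data utility divided by grid carbon intensity, then pick
# the top-k. This is not EcoLearn's actual implementation.
from dataclasses import dataclass

@dataclass
class Client:
    cid: str
    utility: float            # estimated data utility of the client's samples
    carbon_intensity: float   # gCO2/kWh at the client's grid location

def select_clients(clients, k):
    # Favor clients whose data is useful and whose local grid is clean.
    ranked = sorted(clients, key=lambda c: c.utility / c.carbon_intensity,
                    reverse=True)
    return ranked[:k]

clients = [
    Client("a", utility=0.9, carbon_intensity=600.0),  # useful, but dirty grid
    Client("b", utility=0.7, carbon_intensity=100.0),  # useful and clean grid
    Client("c", utility=0.2, carbon_intensity=50.0),   # clean, low utility
]
print([c.cid for c in select_clients(clients, 2)])  # → ['b', 'c']
```

In practice such a score would be combined with the geolocated carbon-intensity feeds and utility estimators mentioned above, and implemented as a client-sampling strategy inside the Flower scheduling loop.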
Contribution/Results: Experiments demonstrate up to 10.8× reduction in carbon emissions versus state-of-the-art methods, while maintaining ≤1% deviation in model accuracy and training time.
📝 Abstract
Federated Learning (FL) distributes machine learning (ML) training across edge devices to reduce data transfer overhead and protect data privacy. Since FL model training may span hundreds of devices and is thus resource- and energy-intensive, it has a significant carbon footprint. Importantly, since energy's carbon intensity differs substantially (by up to 60×) across locations, training on the same device using the same amount of energy, but at different locations, can incur widely different carbon emissions. While prior work has focused on improving FL's resource- and energy-efficiency by optimizing time-to-accuracy, it implicitly assumes all energy has the same carbon intensity and thus does not optimize carbon efficiency, i.e., work done per unit of carbon emitted. To address this problem, we design EcoLearn, which minimizes FL's carbon footprint without significantly affecting model accuracy or training time. EcoLearn achieves a favorable tradeoff by integrating carbon awareness into multiple aspects of FL training, including i) selecting clients with high data utility and low carbon, ii) provisioning more clients during the initial training rounds, and iii) mitigating stragglers by dynamically adjusting client over-provisioning based on carbon. We implement EcoLearn and its carbon-aware FL training policies in the Flower framework and show that it reduces the carbon footprint of training (by up to 10.8×) while maintaining model accuracy and training time (within ~1%) compared to state-of-the-art approaches.
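The over-provisioning and straggler-mitigation idea above can be sketched as a simple schedule: request extra clients in early rounds to absorb stragglers, taper the surplus as training converges, and shrink it further when the current grid is dirtier than average. The function name, the 50% initial margin, and the linear decay below are illustrative assumptions, not EcoLearn's actual policy.

```python
# Illustrative sketch (not EcoLearn's actual policy): carbon-sensitive
# dynamic over-provisioning of clients per training round.
def provisioned_clients(base_k, round_num, total_rounds, carbon_now, carbon_avg):
    # Linearly decay the over-provisioning margin from 50% extra to 0%.
    decay = max(0.0, 1.0 - round_num / total_rounds)
    extra = 0.5 * decay
    # Scale the margin down when current energy is dirtier than average,
    # trading some straggler tolerance for lower emissions.
    carbon_scale = min(1.0, carbon_avg / carbon_now)
    return base_k + int(base_k * extra * carbon_scale)

print(provisioned_clients(10, 0, 100, carbon_now=200, carbon_avg=200))    # → 15
print(provisioned_clients(10, 0, 100, carbon_now=400, carbon_avg=200))    # → 12
print(provisioned_clients(10, 100, 100, carbon_now=200, carbon_avg=200))  # → 10
```

The reported numbers (up to 10.8× lower emissions at within ~1% accuracy and training time) suggest that such carbon-aware scheduling leaves the time-to-accuracy behavior largely intact while shifting work toward cleaner energy.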