AI Summary
Data scarcity, poor generalization, and privacy constraints hinder efficient training of autonomous HVAC controllers in real-world buildings. To address these challenges, this paper pioneers the integration of federated learning into reinforcement learning-based HVAC control. We propose a lightweight, distributed training framework for multi-building collaboration that combines Proximal Policy Optimization (PPO), LSTM-based state modeling, and an edge computing architecture. The framework features an adaptive model aggregation mechanism and a differential privacy protection strategy, which keep data local and privacy-compliant while improving convergence speed and communication efficiency. Extensive experiments across six real-world building datasets demonstrate that our approach reduces energy consumption by 12.3% compared to baseline methods, decreases cross-building generalization error by 37%, and cuts communication overhead by 58%.
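The summary does not specify how the adaptive aggregation or the differential privacy strategy work, so the following is only a minimal sketch of the general pattern such frameworks often follow: each building sends a model update, the server clips each update to a maximum norm, takes a weighted average, and adds Gaussian noise. The function names (`clip_update`, `aggregate`), the weighting scheme, and the parameters `max_norm` and `noise_std` are all hypothetical and not taken from the paper; in particular, plain weighted averaging stands in for whatever adaptive mechanism the authors use.

```python
import math
import random
from typing import List, Optional, Sequence


def clip_update(update: Sequence[float], max_norm: float) -> List[float]:
    """Scale an update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def aggregate(
    updates: Sequence[Sequence[float]],
    weights: Sequence[float],
    max_norm: float = 1.0,
    noise_std: float = 0.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Weighted average of clipped client updates plus Gaussian noise.

    Assumption: weights are per-building importance scores (for example,
    the number of local samples). The paper's adaptive rule is not given.
    """
    if not updates or len(updates) != len(weights):
        raise ValueError("updates and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    rng = rng or random.Random()
    dim = len(updates[0])

    # Clipping bounds each building's influence, which is what lets the
    # added noise mask any single contribution.
    clipped = [clip_update(u, max_norm) for u in updates]

    averaged = [
        sum(w * u[i] for w, u in zip(weights, clipped)) / total
        for i in range(dim)
    ]

    # Noise is added once on the server side (a central-DP assumption).
    return [a + rng.gauss(0.0, noise_std) for a in averaged]


if __name__ == "__main__":
    # Three hypothetical buildings, each sending a 2-parameter update.
    building_updates = [[0.2, -0.1], [0.4, 0.3], [-0.1, 0.0]]
    sample_counts = [100.0, 300.0, 50.0]
    print(aggregate(building_updates, sample_counts, max_norm=1.0,
                    noise_std=0.01, rng=random.Random(0)))
```

Only model updates cross the network in this sketch, never raw sensor or occupancy data, which is the property the summary refers to as data locality. Choosing `noise_std` for a given privacy budget requires a separate accounting step that is not shown here.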