🤖 AI Summary
Hardware-level power monitoring such as Intel RAPL is platform-dependent and reports only coarse, domain-level measurements, which hinders fine-grained per-process energy-efficiency analysis. To address this, we propose a hardware-agnostic modeling framework that uses eBPF and perf together to collect fine-grained per-process resource metrics (CPU, memory, I/O, etc.) and combines them with node-level power measurements from PDUs. A lightweight regression model is then trained to predict per-process energy consumption. This work presents the first cross-platform, eBPF-driven process–power association model, addressing the hardware-specificity and granularity limitations of conventional tools. Experimental evaluation shows an average prediction error below 8.3%, improving both the precision and interpretability of energy-aware management in data centers.
📝 Abstract
The growing demand for data center capacity, driven by high-performance computing, cloud computing, and especially artificial intelligence, has led to a sharp increase in data center energy consumption. Improving energy efficiency requires process-level insight into where that energy goes. While node-level energy consumption can be measured directly with hardware such as power meters, existing mechanisms for estimating per-process energy usage, such as Intel RAPL, are limited to specific hardware and provide only coarse-grained, domain-level measurements. Our proposed approach models per-process energy profiles using fine-grained process-level resource metrics collected via eBPF and perf, synchronized with node-level energy measurements obtained from an attached power distribution unit. By statistically learning the relationship between process-level resource usage and node-level energy consumption through a regression-based model, our approach enables finer-grained per-process energy predictions.
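The regression idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain linear model in which node power equals an idle baseline plus a weighted sum of per-process metrics, fits that model by ordinary least squares, and then attributes the dynamic part of the energy to each process. The function names, the two illustrative metrics, the synthetic data, and the 1-second sampling interval are all hypothetical.

```python
import numpy as np

def fit_power_model(proc_metrics, node_power):
    """Fit node power as idle + weights . (metrics summed over processes).

    proc_metrics: array of shape (T, P, F) = samples x processes x metrics
    node_power:   array of shape (T,), node-level power in watts (e.g. from a PDU)
    Assumption: a linear model; the paper only states "regression-based".
    """
    node_features = proc_metrics.sum(axis=1)            # (T, F)
    ones = np.ones((node_features.shape[0], 1))         # intercept column (idle power)
    design = np.hstack([node_features, ones])           # (T, F + 1)
    coef, *_ = np.linalg.lstsq(design, node_power, rcond=None)
    return coef[:-1], coef[-1]                          # weights, idle power

def attribute_energy(proc_metrics, weights, interval_s):
    """Return dynamic energy per process in joules (idle power is not attributed)."""
    dyn_power = proc_metrics @ weights                  # (T, P) watts per process
    return dyn_power.sum(axis=0) * interval_s           # (P,) joules

# Synthetic, noise-free data: 200 samples, 3 processes, 2 metrics
# (hypothetically CPU utilisation and I/O rate, both normalised to [0, 1]).
rng = np.random.default_rng(0)
metrics = rng.uniform(0.0, 1.0, size=(200, 3, 2))
true_weights = np.array([40.0, 10.0])                   # watts per unit of metric
true_idle = 60.0                                        # watts
power = true_idle + metrics.sum(axis=1) @ true_weights

weights, idle = fit_power_model(metrics, power)
energy = attribute_energy(metrics, weights, interval_s=1.0)
print("weights:", weights, "idle:", idle)
print("per-process dynamic energy (J):", energy)
```

Because the model is linear, fitting on metrics summed over processes and then applying the learned weights to each process separately guarantees that the per-process estimates add up to the modeled dynamic node energy. How idle power is shared among processes is a separate policy choice that this sketch leaves out.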