🤖 AI Summary
This work addresses the challenge of efficiently predicting power consumption for AI workloads on GPUs. Existing power models obtain their utilization inputs from simulation or hardware profiling, which is often impractical for rapid design exploration. The paper proposes a lightweight methodology that exploits the structured execution and memory access patterns produced by AI kernel optimizations. By combining analytical modeling with empirical fitting, the approach predicts module-level hardware utilization without simulation or physical measurement, and uses that utilization to estimate dynamic power. Evaluated on NVIDIA Ampere GPUs, the model achieves an average power prediction error of 8% and generalizes to the H100 architecture with 7% error. Because predictions complete in seconds, the method supports exploration of clock frequency and architectural configurations.
📝 Abstract
As AI workloads drive increases in datacenter power consumption, accurate GPU power estimation is critical for proactive power management. However, existing power models face a scalability bottleneck not in the modeling techniques themselves, but in obtaining the hardware utilization inputs they require. Conventional approaches rely on either costly simulation or hardware profiling, which makes them impractical when rapid predictions are required.
This work presents EnergAIzer, which addresses this scalability bottleneck with a lightweight method for predicting utilization inputs, reducing estimation wall-clock time from hours to seconds. Our key insight is that kernels in AI workloads commonly employ optimizations that create structured patterns, from which memory traffic and the execution timeline can be determined analytically. We construct a performance model that uses these patterns as an analytical scaffold for empirical data fitting, and that also naturally exposes module-level utilization. This predicted utilization is then fed into our power model to estimate dynamic power consumption.
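To make the idea of an analytical scaffold concrete, the following is a minimal sketch and not EnergAIzer's actual model, which the abstract does not specify. It assumes an output-tiled GEMM, no cross-tile cache reuse, perfect overlap of compute and memory, and a linear utilization-to-power mapping. All names (`Gpu`, `tiled_gemm_utilization`, `fit_scale`, `dynamic_power`) and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Gpu:
    peak_flops: float  # peak math throughput in FLOP/s (illustrative value)
    dram_bw: float     # peak DRAM bandwidth in bytes/s (illustrative value)


def tiled_gemm_utilization(m, n, k, tile_m, tile_n, gpu, bytes_per_elem=2):
    """Estimate module utilization for C[m,n] = A[m,k] @ B[k,n] with output tiling.

    Assumption: each output tile streams its A and B strips from DRAM once
    (no cross-tile cache reuse), and compute and memory fully overlap.
    """
    tiles = ceil(m / tile_m) * ceil(n / tile_n)
    # Per tile: read a tile_m x k strip of A and a k x tile_n strip of B, write the tile.
    bytes_per_tile = (tile_m * k + k * tile_n + tile_m * tile_n) * bytes_per_elem
    dram_bytes = tiles * bytes_per_tile
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    t_compute = flops / gpu.peak_flops
    t_memory = dram_bytes / gpu.dram_bw
    t_total = max(t_compute, t_memory)  # the slower resource sets the runtime
    util = {"compute": t_compute / t_total, "dram": t_memory / t_total}
    return util, t_total


def fit_scale(analytical, measured):
    """Least-squares scale factor s that minimizes sum((s * a - y) ** 2)."""
    return sum(a * y for a, y in zip(analytical, measured)) / sum(a * a for a in analytical)


def dynamic_power(util, watts_at_full_util):
    """Assumed linear power model: per-module watts at 100% utilization, scaled by utilization."""
    return sum(watts_at_full_util[name] * u for name, u in util.items())


# Hypothetical device and coefficients, not taken from the paper or any datasheet.
gpu = Gpu(peak_flops=300e12, dram_bw=1.5e12)
util, runtime_s = tiled_gemm_utilization(4096, 4096, 4096, 128, 128, gpu)
power_w = dynamic_power(util, {"compute": 200.0, "dram": 80.0})
```

In a sketch like this, `fit_scale` stands in for the empirical step: the closed-form traffic and timeline give the shape of the prediction, and a small number of measurements would calibrate it to a real device.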
EnergAIzer achieves an 8% power error on NVIDIA Ampere GPUs, competitive with traditional power models that rely on elaborate cycle-level simulation or hardware profiling. We demonstrate EnergAIzer's exploration capabilities for frequency scaling and architectural configurations, including forecasting the power of the NVIDIA H100 with just 7% error. In summary, EnergAIzer provides fast and accurate power prediction for AI workloads, paving the way for power-aware design exploration.
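As background for the frequency-scaling exploration, the standard CMOS relation for dynamic power is a useful reference point. The abstract does not state the form of EnergAIzer's power model, so the per-module decomposition below is an assumption: $u_i$ denotes the predicted utilization of module $i$, $C_i$ an effective switched capacitance that would be fitted empirically, $V$ the supply voltage, and $f$ the clock frequency.

$$
P_{\text{dyn}} \approx \sum_{i} u_i \, C_i \, V^2 f
$$

Under this form, lowering $f$ reduces power directly through the $f$ term and, when the voltage is lowered along with it, through $V^2$. It can also shift the utilization terms $u_i$, since compute time stretches while memory time may not, which is why a utilization predictor is needed to explore frequency settings.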