🤖 AI Summary
This study addresses comparability and reproducibility challenges in energy-efficiency evaluation between synthetic benchmarks (e.g., HPL, STREAM) and real-world applications (e.g., GROMACS) on HPC systems. We systematically analyze power and performance measurement discrepancies across CPU (Intel Ice Lake/Sapphire Rapids) and GPU (NVIDIA A40/A100) platforms on the Fritz and Alex supercomputers. Using MPI-parallelized experiments, we perform fine-grained energy-efficiency monitoring via LIKWID (CPUs) and NVIDIA Nsight Tools (GPUs), uncovering critical experimental pitfalls, including instrumentation calibration errors, inadequate capture of load transient responses, and platform firmware heterogeneity. We propose a standardized measurement protocol for HPC energy-efficiency research, specifying best practices for sampling frequency, steady-state detection, and hardware counter alignment. The work delivers a cross-architecture, reproducible energy-efficiency benchmark dataset and identifies three primary factors undermining assessment consistency, establishing a methodological foundation for green HPC evaluation frameworks.
📝 Abstract
This paper discusses the challenges encountered when analyzing the energy efficiency of synthetic benchmarks and the GROMACS package on the Fritz and Alex HPC clusters. Experiments were conducted using MPI parallelism on full sockets of Intel Ice Lake and Sapphire Rapids CPUs, as well as on NVIDIA A40 and A100 GPUs. The metrics and measurements obtained with the LIKWID and NVIDIA profiling tools are presented together with the results. The challenges and pitfalls encountered during experimentation and analysis are identified and discussed, and best practices for future energy-efficiency analysis studies are suggested.