🤖 AI Summary
To address the HPEC Anonymized Network Sensing Graph Challenge, this paper proposes a GPU-accelerated, end-to-end graph analytics workflow built on standard data science tooling, with no need for domain-specific HPC code. It reinterprets the challenge's GraphBLAS formulations as data science primitives and implements them with open-source libraries, including RAPIDS cuDF and CuPy, covering I/O, graph representation, data preparation, and network analysis on NVIDIA A100, H100, and H200 GPUs. The core contribution is showing that high performance and high developer productivity can be achieved together: relative to the same code running with Pandas on the CPU, speedups reach 147×–509× on A100, 243×–1,269× on H100, and 332×–2,185× on H200. These results indicate that general-purpose data science tools can be applied to large-scale graph analytics.
📝 Abstract
The HPEC Graph Challenge is a collection of benchmarks representing complex workloads that test the hardware and software components of HPC systems in ways that traditional benchmarks, such as LINPACK, do not. The first benchmark, Subgraph Isomorphism, focused on several compute-bound and memory-bound kernels. The most recent of the challenges, the Anonymized Network Sensing Graph Challenge, marks a shift in direction: it is a longer end-to-end workload that requires many more software components, including, but not limited to, data I/O, data structures for representing graph data, and a wide range of functions for data preparation and network analysis. A notable feature of this new graph challenge is the use of GraphBLAS to represent the computational aspects of the problem statement. In this paper, we show an alternative interpretation of the GraphBLAS formulations using the language of data science. With this formulation, we show that the new graph challenge can be implemented using off-the-shelf ETL tools available in open-source, enterprise software such as NVIDIA's RAPIDS ecosystem. Using off-the-shelf software, RAPIDS cuDF and CuPy, we enable significant software acceleration without requiring any specific HPC code and show speedups, over the same code running with Pandas on the CPU, of 147x-509x on an NVIDIA A100 GPU, 243x-1269x on an NVIDIA H100 GPU, and 332x-2185x on an NVIDIA H200 GPU.
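To make the idea of reading a GraphBLAS formulation in the language of data science concrete, the following is a minimal sketch and not the paper's implementation. It assumes a packet table with hypothetical `src` and `dst` columns, treats the sparse traffic matrix A(i, j) as a group-by count over (src, dst) pairs, and expresses a few matrix reductions as DataFrame aggregations. The function name `network_stats`, the statistic names, and the sample data are all illustrative. The sketch runs on CPU with Pandas; whether replacing the import with cuDF runs unchanged is an assumption that is not verified here.

```python
import pandas as pd  # assumption: cuDF's pandas-like API could be swapped in for GPU runs


def network_stats(packets):
    """Compute a few traffic-matrix statistics from a (src, dst) packet table."""
    # Sparse matrix A(i, j) = packets sent from i to j, stored as (src, dst, packets) rows.
    links = packets.groupby(["src", "dst"]).size().reset_index(name="packets")
    # Row reductions of A: total packets and number of distinct destinations per source.
    packets_per_src = links.groupby("src")["packets"].sum()
    fan_out_per_src = links.groupby("src")["dst"].count()
    return {
        "valid_packets": int(links["packets"].sum()),      # sum of all entries of A
        "unique_links": int(len(links)),                   # number of nonzeros in A
        "max_link_packets": int(links["packets"].max()),   # largest entry of A
        "unique_sources": int(len(packets_per_src)),       # number of nonempty rows
        "max_source_packets": int(packets_per_src.max()),  # largest row sum
        "max_source_fan_out": int(fan_out_per_src.max()),  # most nonzeros in a row
    }


# Hypothetical sample: six packets among three anonymized endpoints.
packets = pd.DataFrame({"src": [1, 1, 1, 2, 2, 3], "dst": [2, 2, 3, 3, 3, 1]})
stats = network_stats(packets)
print(stats)
```

Each statistic here is a reduction over the (src, dst, packets) table, which is why ordinary ETL operations such as group-by, count, sum, and max are enough to express it.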