Combining Performance and Productivity: Accelerating the Network Sensing Graph Challenge with GPUs and Commodity Data Science Software

📅 2025-09-03
🤖 AI Summary
To address the HPEC Anonymized Network Sensing Graph Challenge, this paper proposes a GPU-accelerated, end-to-end graph analytics workflow built on the data science paradigm, with no domain-specific HPC code required. It reinterprets the challenge's GraphBLAS formulations as data science primitives and uses open-source libraries, including RAPIDS cuDF and CuPy, to accelerate I/O, graph representation, data preprocessing, and network analysis on NVIDIA A100, H100, and H200 GPUs. The core contribution is combining high performance with high developer productivity: relative to the same code running with Pandas on the CPU, speedups reach 147×–509× on A100, 243×–1,269× on H100, and 332×–2,185× on H200. These results suggest that general-purpose data science tools can be applied to large-scale graph analytics.

📝 Abstract
The HPEC Graph Challenge is a collection of benchmarks representing complex workloads that test the hardware and software components of HPC systems in ways that traditional benchmarks, such as LINPACK, do not. The first benchmark, Subgraph Isomorphism, focused on several compute-bound and memory-bound kernels. The most recent of the challenges, the Anonymized Network Sensing Graph Challenge, marks a shift in direction: it is a longer end-to-end workload that requires many more software components, including, but not limited to, data I/O, data structures for representing graph data, and a wide range of functions for data preparation and network analysis. A notable feature of this new graph challenge is the use of GraphBLAS to represent the computational aspects of the problem statement. In this paper, we show an alternative interpretation of the GraphBLAS formulations using the language of data science. With this formulation, we show that the new graph challenge can be implemented using off-the-shelf ETL tools available in open-source, enterprise software such as NVIDIA's RAPIDS ecosystem. Using off-the-shelf software, RAPIDS cuDF and CuPy, we enable significant software acceleration without requiring any specific HPC code and show speedups, over the same code running with Pandas on the CPU, of 147x-509x on an NVIDIA A100 GPU, 243x-1269x on an NVIDIA H100 GPU, and 332x-2185x on an NVIDIA H200 GPU.

Problem

Research questions and friction points this paper is trying to address.

Accelerating the Anonymized Network Sensing Graph Challenge with GPUs
Expressing the challenge's GraphBLAS formulations in the language of data science
Achieving acceleration with off-the-shelf tools rather than custom HPC code

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using the NVIDIA RAPIDS ecosystem for acceleration
Reinterpreting GraphBLAS formulations in data science terms
Implementing the challenge with off-the-shelf ETL tools
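
As a rough illustration of reading a GraphBLAS-style formulation in data science terms, the sketch below builds a traffic matrix in coordinate form with a group-by and derives a few summary statistics from it. It is not the paper's code: the column names `src` and `dst`, the function name `network_stats`, the sample packets, and the particular statistics chosen are all assumptions made for illustration. It runs on the CPU with Pandas; since cuDF provides a pandas-like DataFrame API, a similar script could plausibly target the GPU by swapping the import, though that is not verified here.

```python
import pandas as pd  # assumption: a cuDF DataFrame could stand in for the GPU version


def network_stats(packets: pd.DataFrame) -> dict:
    """Summarize a packet table with hypothetical columns `src` and `dst`."""
    # Traffic matrix in coordinate form: one row per (src, dst) link,
    # with the number of packets observed on that link.
    links = packets.groupby(["src", "dst"]).size().reset_index(name="packets")

    # Row-wise reductions of the matrix (per source).
    src_packets = links.groupby("src")["packets"].sum()
    src_fanout = links.groupby("src").size()

    # Column-wise reductions of the matrix (per destination).
    dst_packets = links.groupby("dst")["packets"].sum()
    dst_fanin = links.groupby("dst").size()

    return {
        "valid_packets": int(links["packets"].sum()),
        "unique_links": int(len(links)),
        "max_link_packets": int(links["packets"].max()),
        "unique_sources": int(len(src_packets)),
        "max_source_packets": int(src_packets.max()),
        "max_source_fanout": int(src_fanout.max()),
        "unique_destinations": int(len(dst_packets)),
        "max_destination_packets": int(dst_packets.max()),
        "max_destination_fanin": int(dst_fanin.max()),
    }


# Hypothetical sample: six packets between three anonymized endpoints.
packets = pd.DataFrame({"src": [1, 1, 1, 2, 2, 3], "dst": [2, 2, 3, 3, 3, 1]})
stats = network_stats(packets)
print(stats)
```

In matrix terms, the first group-by plays the role of assembling a sparse matrix from (row, column) pairs with duplicate entries summed, and the per-source and per-destination group-bys correspond to row and column reductions of that matrix.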