Hunting CUDA Bugs at Scale with cuFuzz

📅 2026-03-12
🤖 AI Summary
This work addresses the detection of memory-safety and concurrency bugs in GPU programs. The heterogeneous host-device execution model and complex software stacks make such bugs common, and prior GPU fuzzing efforts have had limited success. To this end, we propose cuFuzz, the first CUDA-oriented fuzzer, which fuzzes whole programs and combines coverage feedback from both host and device code. cuFuzz uses NVBit to instrument device-side instructions and compiler-based instrumentation for host-side coverage. It runs host- and device-side sanitizers in separate processes, so sanitization is decoupled from coverage collection, and it supports persistent mode to improve throughput. Evaluated on 14 CUDA programs, cuFuzz uncovered 43 previously unknown bugs (19 in commercial libraries), including illegal memory accesses, uninitialized reads, and data races, and discovered significantly more edges and unique inputs than baseline approaches.
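The idea of running sanitizers in their own processes, apart from coverage collection, can be illustrated with a minimal Python sketch. This is not cuFuzz's code: the function names, the `STANDINS` commands, and the "nonzero exit means a finding" convention are all assumptions made for illustration. In a real setup the commands might wrap the target with a host-side or device-side sanitizer; which tools cuFuzz invokes, and how, is not stated here.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_sanitizer(cmd: list[str], data: bytes, timeout: float = 10.0) -> bool:
    """Run one sanitizer command in its own process, feeding `data` on stdin.

    Returns True if the run looks like a finding (nonzero exit or timeout).
    This exit-code convention is an assumption of the sketch.
    """
    try:
        proc = subprocess.run(cmd, input=data, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return True
    return proc.returncode != 0


def check_input(data: bytes, sanitizer_cmds: dict[str, list[str]]) -> dict[str, bool]:
    """Run every sanitizer command on the same input, each in a separate process."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(run_sanitizer, cmd, data)
            for name, cmd in sanitizer_cmds.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for sanitizer-wrapped targets, so the sketch runs anywhere.
STANDINS = {
    "host": [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"],
    "device": [
        sys.executable,
        "-c",
        "import sys; sys.exit(1 if b'BUG' in sys.stdin.buffer.read() else 0)",
    ],
}

if __name__ == "__main__":
    print(check_input(b"BUG", STANDINS))
```

Because each sanitizer runs as a separate process, a crash or slowdown in one of them does not disturb the process that collects coverage, which is the property the summary attributes to the decoupled design.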

📝 Abstract
GPUs play an increasingly important role in modern software. However, the heterogeneous host-device execution model and expanding software stacks make GPU programs prone to memory-safety and concurrency bugs that evade static analysis. While fuzz testing, combined with dynamic error-checking tools, offers a plausible solution, it remains underutilized for GPUs. In this work, we identify three main obstacles limiting prior GPU fuzzing efforts: (1) kernel-level fuzzing that leads to false positives, (2) a lack of device-side coverage feedback, and (3) incompatibility between coverage and sanitization tools. We present cuFuzz, the first CUDA-oriented fuzzer that makes GPU fuzzing practical by addressing these obstacles. cuFuzz uses whole-program fuzzing to avoid the false positives that arise from fuzzing device-side kernels independently. It leverages NVBit to instrument device-side instructions and merges the resulting coverage with compiler-based host coverage. Finally, cuFuzz decouples sanitization from coverage collection by executing host- and device-side sanitizers in separate processes. cuFuzz uncovers 43 previously unknown bugs (19 in commercial libraries) across 14 CUDA programs, including illegal memory accesses, uninitialized reads, and data races. cuFuzz discovers significantly more edges and unique inputs than baseline approaches, especially on closed-source targets. Moreover, we quantify the execution-time overheads of the different cuFuzz components and add persistent-mode support to improve overall fuzzing throughput. Our results demonstrate that cuFuzz is an effective and deployable addition to the GPU testing toolbox. cuFuzz is publicly available at https://github.com/NVlabs/cuFuzz/.
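Merging device-side coverage with host-side coverage can be sketched as follows. This is a minimal illustration in Python, not cuFuzz's implementation: the function names are hypothetical, and the choice to place host and device hit counts in disjoint regions of one map (so that device edges never collide with host edges) is an assumption of the sketch, as is the byte-per-slot map format.

```python
def merge_coverage(host_map: bytes, device_map: bytes) -> bytes:
    """Combine host and device hit counts into a single coverage map.

    Assumed layout: host slots first, device slots after them.
    """
    return host_map + device_map


def has_new_coverage(merged: bytes, seen: bytearray) -> bool:
    """Return True if `merged` hits a slot not seen before, and record all hits."""
    if len(merged) != len(seen):
        raise ValueError("merged map and seen map must have the same length")
    new = False
    for index, hits in enumerate(merged):
        if hits and not seen[index]:
            seen[index] = 1
            new = True
    return new


if __name__ == "__main__":
    seen = bytearray(8)
    first = merge_coverage(bytes([0, 1, 0, 0]), bytes([0, 0, 2, 0]))
    print(has_new_coverage(first, seen))   # new host and device slots
    print(has_new_coverage(first, seen))   # same input again, nothing new
```

With a combined map, an input that exercises a new device-side edge is kept even when the host-side path is unchanged, which is the kind of feedback that host-only coverage cannot provide.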
Problem

Research questions and friction points this paper is trying to address.

GPU fuzzing
CUDA bugs
memory safety
concurrency bugs
coverage-guided feedback
Innovation

Methods, ideas, or system contributions that make the work stand out.

GPU fuzzing
CUDA
coverage-guided fuzzing
memory safety
dynamic sanitization