WarpGuard: Protected-Site Control-Flow Integrity for CUDA SASS Binaries

📅 2026-06-10

🤖 AI Summary

Existing CUDA security mechanisms do not defend against control-flow hijacking at the SASS binary level, where GPU memory vulnerabilities can corrupt return addresses and indirect-branch targets. This work presents what the authors describe as the first protected-site control-flow integrity (CFI) enforcement at the NVIDIA SASS execution layer. By combining static analysis with dynamic instrumentation, the approach validates return addresses and forward jump targets at runtime and fails closed, terminating execution as soon as a violation is detected. It handles the calling patterns found in real-world CUDA programs, classifying 51,621 SASS control-flow sites across 77 CUDA artifacts and recording 52.2 million dynamic checks during execution. In representative backward- and forward-edge corruption attacks, enforcement stops the invalid transfer before it is released.
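The check-then-release behavior summarized above can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not WarpGuard's implementation: the `SiteChecker` class, the `Mode` enum, the per-site target map, and the use of an HMAC tag to authenticate a saved return address are all hypothetical choices made for illustration. The paper does not state which authentication scheme it uses, and the real checks run in instrumented SASS rather than in Python.

```python
import hashlib
import hmac
from enum import Enum


class Mode(Enum):
    DETECT_ONLY = "detect-only"  # record the violation and let execution continue
    ENFORCE = "enforce"          # fail closed before the transfer is released


class CfiViolation(Exception):
    """Raised in ENFORCE mode when a protected transfer fails its check."""


class SiteChecker:
    """Toy model of per-site checks (all names here are hypothetical)."""

    def __init__(self, key: bytes, forward_targets: dict, mode: Mode):
        self.key = key                          # assumed secret used for tagging
        self.forward_targets = forward_targets  # site address -> allowed targets
        self.mode = mode
        self.violations = []                    # (edge kind, site, offending value)

    def tag_return(self, site: int, ret_addr: int) -> bytes:
        # Tag the continuation when it is saved, binding it to its site.
        msg = site.to_bytes(8, "little") + ret_addr.to_bytes(8, "little")
        return hmac.new(self.key, msg, hashlib.sha256).digest()

    def check_return(self, site: int, ret_addr: int, tag: bytes) -> bool:
        # Backward edge: the address about to be consumed must match its tag.
        ok = hmac.compare_digest(self.tag_return(site, ret_addr), tag)
        return self._release("backward", site, ret_addr, ok)

    def check_forward(self, site: int, target: int) -> bool:
        # Forward edge: the target must be in this site's recovered target set.
        ok = target in self.forward_targets.get(site, frozenset())
        return self._release("forward", site, target, ok)

    def _release(self, edge: str, site: int, value: int, ok: bool) -> bool:
        if ok:
            return True
        self.violations.append((edge, site, value))
        if self.mode is Mode.ENFORCE:
            raise CfiViolation(f"{edge}-edge violation at site {site:#x}: {value:#x}")
        return False
```

In detect-only mode a corrupted value is logged and the check returns `False`; in enforce mode the same input raises `CfiViolation`, which models stopping before the invalid transfer is released.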

📝 Abstract

Recent CUDA exploitation work shows that GPU memory bugs can escalate into device-side control-flow corruption when kernels later consume corrupted return continuations, function pointers, dispatch-table entries, or branch targets. For deployed CUDA binaries, the relevant security boundary is executed NVIDIA SASS, after PTX lowering, inlining, ABI decisions, register allocation, spills, predication, and SIMT execution; source- or PTX-level policies do not capture this boundary. We present WarpGuard, to our knowledge the first protected-site CFI system for CUDA device binaries operating on executed SASS. WarpGuard enforces policy at protected sites: recovered SASS instructions or sequences that consume control-flow state and provide sufficient binary evidence to derive a policy. Each protected site is checked before its transfer is released and fails closed on violation. WarpGuard authenticates backward-edge continuation state for instrumented returns, validates recoverable forward targets per site, and reports fixed-edge, unsupported, profile-excluded, fallback, and no-surface outcomes outside the protected denominator. On 77 CUDA artifacts, WarpGuard classifies 51,621 SASS control-flow sites, including 1,343 returns and 154 supported forward target-set entries, and records 52.2 million dynamic checks. In representative backward- and forward-edge corruption attacks, native execution reaches attacker-selected behavior, detect-only mode records the expected violation, and enforcement fails closed before releasing the invalid protected transfer. Public-code evidence shows that the same SASS consumption patterns occur in real CUDA systems, including runtime dispatch tables, cuFFT callbacks, generated callable tables, and uploaded device-function pointers. WarpGuard delivers auditable protected-site CFI for CUDA SASS and separates dynamic-instrumentation enforcement from callback-free SASS timing and patch-cache feasibility.
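The reporting scheme in the abstract, where some outcomes are counted outside the protected denominator, can be sketched as a simple tally. The five non-protected outcome names come from the abstract; the two protected categories, the `Outcome` enum, and the `summarize` helper are assumptions introduced for illustration and are not taken from the paper.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    # Hypothetical labels for sites that receive a runtime check.
    PROTECTED_RETURN = "protected-return"
    PROTECTED_FORWARD = "protected-forward"
    # Outcomes the abstract lists as outside the protected denominator.
    FIXED_EDGE = "fixed-edge"
    UNSUPPORTED = "unsupported"
    PROFILE_EXCLUDED = "profile-excluded"
    FALLBACK = "fallback"
    NO_SURFACE = "no-surface"


PROTECTED = {Outcome.PROTECTED_RETURN, Outcome.PROTECTED_FORWARD}


def summarize(outcomes):
    """Tally classified sites, keeping protected and other outcomes apart."""
    counts = Counter(outcomes)
    return {
        "classified": sum(counts.values()),
        "protected_denominator": sum(counts[o] for o in PROTECTED),
        "outside_denominator": {
            o.value: n for o, n in counts.items() if o not in PROTECTED
        },
    }
```

Keeping the two groups separate means a coverage figure is computed only over sites that are actually checked, while every other classified site remains visible in the report instead of being silently dropped.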

Problem

Research questions and friction points this paper is trying to address.

Control-Flow Integrity

CUDA

SASS

GPU Security

Binary Protection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Control-Flow Integrity

CUDA SASS

Protected-Site CFI