Modular GPU Programming with Typed Perspectives

📅 2025-11-14

🤖 AI Summary

In GPU programming, the tension between fine-grained per-thread control and coarse-grained collective operations (e.g., Tensor Core instructions) undermines modularity and safety: collective primitives require coordinated execution across groups of threads, yet the functions that encapsulate them are called per-thread. Method: We introduce Prism, a new GPU language built around *typed perspectives*, a type-system mechanism that records, at the type level, the granularity at which the programmer is controlling threads. This lets collective operations be wrapped in modular functions without giving up the low-level control needed for high performance. We describe the language design, implement a compiler, and lay the theoretical foundations in a core calculus called Bundl. Contribution/Results: State-of-the-art GPU kernels implemented in Prism show that it provides the safety guarantees needed to write modular code confidently, without sacrificing performance.

📝 Abstract

To achieve peak performance on modern GPUs, one must balance two frames of mind: issuing instructions to individual threads to control their behavior, while simultaneously tracking the convergence of many threads acting in concert to perform collective operations like Tensor Core instructions. The tension between these two mindsets makes modular programming error prone. Functions that encapsulate collective operations, despite being called per-thread, must be executed cooperatively by groups of threads. In this work, we introduce Prism, a new GPU language that restores modularity while still giving programmers the low-level control over collective operations necessary for high performance. Our core idea is typed perspectives, which materialize, at the type level, the granularity at which the programmer is controlling the behavior of threads. We describe the design of Prism, implement a compiler for it, and lay its theoretical foundations in a core calculus called Bundl. We implement state-of-the-art GPU kernels in Prism and find that it offers programmers the safety guarantees needed to confidently write modular code without sacrificing performance.
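The abstract does not show Prism syntax, so the following is only a loose, host-side analogy in plain C++ of what "materializing granularity at the type level" could look like. Every name here (`ThreadView`, `WarpView`, `At`, `for_each_lane`, `warp_sum`, `demo`) is hypothetical and not taken from Prism or Bundl, and the group size of 32 is an assumption. The sketch uses tag types so that a collective operation only accepts data produced from a whole-group perspective, and a per-thread value is rejected at compile time.

```cpp
#include <array>
#include <cstddef>
#include <numeric>
#include <type_traits>

// Hypothetical tag types naming the granularity ("perspective") of the code.
struct ThreadView {};  // written from the viewpoint of one thread
struct WarpView {};    // written from the viewpoint of a whole thread group

constexpr std::size_t kWarpSize = 32;  // assumed group size for this sketch

// A value annotated with the perspective under which it was produced.
template <typename Perspective, typename T>
struct At {
    T value;
};

using LaneValues = std::array<int, kWarpSize>;

// Per-thread operation: takes and returns a single thread's value.
inline At<ThreadView, int> scale(At<ThreadView, int> x, int k) {
    return {x.value * k};
}

// Collective operation: only accepts data for the whole group.
inline At<WarpView, int> warp_sum(const At<WarpView, LaneValues>& lanes) {
    return {std::accumulate(lanes.value.begin(), lanes.value.end(), 0)};
}

// Moves from the group perspective to the per-thread one: runs `per_thread`
// once for each lane id and gathers the results back into a group-level value.
// (A sequential host loop standing in for threads; not real GPU execution.)
template <typename F>
At<WarpView, LaneValues> for_each_lane(F per_thread) {
    At<WarpView, LaneValues> out{};
    for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
        At<ThreadView, int> r = per_thread(At<ThreadView, int>{static_cast<int>(lane)});
        out.value[lane] = r.value;
    }
    return out;
}

// A per-thread value cannot be passed to the collective operation.
static_assert(!std::is_invocable_v<decltype(&warp_sum), At<ThreadView, int>>,
              "warp_sum must not accept a single thread's value");

// Example: each lane doubles its lane id, then the group sums the results.
inline int demo() {
    auto lanes = for_each_lane([](At<ThreadView, int> lane) { return scale(lane, 2); });
    return warp_sum(lanes).value;
}
```

In this analogy, the function signature alone tells a caller whether a helper is per-thread or collective, which is the modularity property the abstract describes. A real GPU language would additionally have to account for thread divergence and convergence, which this host-only sketch does not model.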

Problem

Research questions and friction points this paper is trying to address.

Achieving peak GPU performance requires balancing individual thread control with collective operations

Modular GPU programming becomes error-prone due to tension between thread-level and collective operations

Providing type-safe abstractions for thread granularity control while maintaining high performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Typed perspectives track, at the type level, the granularity at which threads are being controlled

Prism language enables modular GPU programming while retaining low-level control over collective operations

Compiler implementation maintains performance while ensuring safety
