Design of a GPU with Heterogeneous Cores for Graphics

📅 2026-01-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the large variation in computational and memory-bandwidth demands across different regions of a rendered scene, which conventional homogeneous GPUs handle inefficiently. The authors propose KHEPRI, a GPU architecture that brings heterogeneous cores to the graphics pipeline by combining compute-specialized cores (optimized for high ILP) with memory-specialized cores (which tolerate more simultaneously outstanding cache misses). A dynamic scheduler exploits frame-to-frame coherence: it predicts each tile's behavior from the corresponding tile in the previous frame and assigns the tile to the most suitable core type while preserving data locality. According to the authors, the design adds no hardware overhead. Across a range of commercial animated graphics applications, KHEPRI improves average performance by 9.2%, increases throughput (frames per second) by 7.3%, and reduces total GPU energy by 4.8% relative to a traditional homogeneous GPU.

📝 Abstract

Heterogeneous architectures can deliver higher performance and energy efficiency than symmetric counterparts by using multiple architectures tuned to different types of workloads. While previous works focused on CPUs, this work extends the concept of heterogeneity to GPUs by proposing KHEPRI, a heterogeneous GPU architecture for graphics applications. Scenes in graphics applications showcase diversity, as they consist of many objects with varying levels of complexity. As a result, computational intensity and memory bandwidth requirements differ significantly across different regions of each scene. To address this variability, our proposal includes two types of cores: cores optimized for high ILP (compute-specialized) and cores that tolerate a higher number of simultaneously outstanding cache misses (memory-specialized). A key component of the proposed architecture is a novel work scheduler that dynamically assigns each part of a frame (i.e., a tile) to the most suitable core. Designing this scheduler is particularly challenging, as it must preserve data locality; otherwise, the benefits of heterogeneity may be offset by the penalty of additional cache misses. Additionally, the scheduler requires knowledge of each tile's characteristics before rendering it. For this purpose, KHEPRI leverages frame-to-frame coherence to predict the behavior of each tile based on that of the corresponding tile in the previous frame. Evaluations across a wide range of commercial animated graphics applications show that, compared to a traditional homogeneous GPU, KHEPRI achieves an average performance improvement of 9.2%, a throughput increase (frames per second) of 7.3%, and a total GPU energy reduction of 4.8%. Importantly, these benefits are achieved without any hardware overhead.
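The tile-to-core prediction described in the abstract can be illustrated with a minimal sketch. This is not KHEPRI's actual scheduler: the abstract does not state which metric or policy is used, so the `TileStats` fields, the misses-per-kilo-instruction criterion, the threshold value, and the default for tiles without history are all assumptions made for illustration. Data locality, which the abstract names as a key design constraint, is not modeled here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

TileId = Tuple[int, int]  # (column, row) of a tile in screen space


@dataclass(frozen=True)
class TileStats:
    """Counters recorded while a tile was rendered (hypothetical fields)."""
    instructions: int
    cache_misses: int


def predict_core_type(prev: Optional[TileStats], mpki_threshold: float = 20.0) -> str:
    """Pick a core type for a tile from its previous-frame counters.

    Assumption: a tile with many cache misses per kilo-instruction (MPKI)
    is treated as memory-bound; the threshold is an arbitrary example value.
    """
    if prev is None or prev.instructions == 0:
        # No history for this tile (e.g. first frame): assumed default.
        return "compute"
    mpki = 1000.0 * prev.cache_misses / prev.instructions
    return "memory" if mpki > mpki_threshold else "compute"


def schedule_frame(
    tiles: Iterable[TileId],
    prev_frame_stats: Dict[TileId, TileStats],
    mpki_threshold: float = 20.0,
) -> Dict[TileId, str]:
    """Map each tile of the current frame to a predicted core type,
    using the counters of the same tile in the previous frame."""
    return {
        tile: predict_core_type(prev_frame_stats.get(tile), mpki_threshold)
        for tile in tiles
    }


# Example: two tiles with history, one tile seen for the first time.
prev_frame = {
    (0, 0): TileStats(instructions=10_000, cache_misses=50),   # 5 MPKI
    (1, 0): TileStats(instructions=10_000, cache_misses=900),  # 90 MPKI
}
plan = schedule_frame([(0, 0), (1, 0), (2, 0)], prev_frame)
```

In this sketch the prediction relies only on counters from the previous frame, which mirrors the frame-to-frame coherence idea: a tile is assumed to behave much like the corresponding tile one frame earlier.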

Problem

Research questions and friction points this paper is trying to address.

heterogeneous GPU

graphics applications

workload diversity

compute-memory imbalance

scene complexity

Innovation

Methods, ideas, or system contributions that make the work stand out.

heterogeneous GPU

compute-specialized core

memory-specialized core