A comprehensive evaluation of spatial co-execution on GPUs using MPS and MIG technologies

📅 2026-04-24

🤖 AI Summary
This study addresses the underutilization of compute resources in modern GPUs by systematically evaluating the performance, energy efficiency, and resource isolation of NVIDIA’s Multi-Process Service (MPS) and Multi-Instance GPU (MIG) technologies when applications run concurrently. The experiments reveal a trade-off between MPS’s flexibility and MIG’s hardware-level isolation. In the most favorable scenarios, MPS improves performance by up to 30% and reduces energy consumption by about 20%, but under memory contention its performance worsens by around 30%. MIG resolves memory contention through full hardware isolation and yields more consistent improvements, although higher overhead tempers those gains and its rigid partitioning scheme can degrade performance in certain cases. These findings give an empirical basis for choosing a GPU co-execution strategy according to job profiles.

📝 Abstract
To mitigate the increasingly common underutilization of computational resources in modern GPUs, spatial sharing methods enable multiple applications to use them simultaneously. This work presents a comprehensive evaluation of NVIDIA's primary technologies to achieve that goal: Multi-Process Service (MPS) and Multi-Instance GPU (MIG). Our findings reveal a crucial trade-off between MPS's flexibility and MIG's isolation, and provide many key insights for improving the co-execution strategy according to job profiles. In the most favorable scenarios, MPS improves performance by up to 30% and reduces energy by about 20%, using its provisioning option to avoid resource monopolization. However, under memory contention, it suffers severe degradation, worsening performance by around 30%. Conversely, MIG's full hardware isolation resolves memory contention, leading to more consistent improvements, but these gains are tempered by higher overhead, and its rigid scheme can degrade performance in certain cases.
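
The abstract mentions an MPS provisioning option that keeps one job from monopolizing the GPU. As a minimal sketch, assuming this refers to the `CUDA_MPS_ACTIVE_THREAD_PERCENTAGE` environment variable from NVIDIA's MPS documentation, the snippet below builds a per-client environment with a cap on execution resources. The helper names `mps_client_env` and `even_split`, and the equal-share policy, are illustrative assumptions and not the paper's method.

```python
import os

# CUDA MPS environment variable that caps the portion of the GPU's
# execution resources an MPS client process may use.
MPS_SM_CAP_VAR = "CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"


def mps_client_env(sm_percentage, base_env=None):
    """Return a copy of an environment with an MPS resource cap set.

    sm_percentage: cap in the range (0, 100].
    base_env: mapping to copy; defaults to the current process environment.
    (Hypothetical helper, not part of any NVIDIA tool.)
    """
    if not 0 < sm_percentage <= 100:
        raise ValueError("sm_percentage must be in (0, 100]")
    env = dict(os.environ if base_env is None else base_env)
    env[MPS_SM_CAP_VAR] = str(sm_percentage)
    return env


def even_split(n_jobs):
    """Hypothetical policy: give each of n_jobs co-running clients an equal cap."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    return [100 // n_jobs] * n_jobs


if __name__ == "__main__":
    # A real launch needs a running MPS control daemon and would look like
    #   subprocess.Popen(["./job_a"], env=mps_client_env(50))
    # where ./job_a is a placeholder for a CUDA application.
    for share in even_split(2):
        print(mps_client_env(share, base_env={})[MPS_SM_CAP_VAR])
```

Which caps work best depends on the job profiles being co-run; the equal split here is only a starting point for experimentation.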

Problem

Research questions and friction points this paper is trying to address.

GPU underutilization
spatial sharing
co-execution
resource contention
performance isolation

Innovation

Methods, ideas, or system contributions that make the work stand out.

spatial co-execution
Multi-Process Service (MPS)
Multi-Instance GPU (MIG)
GPU resource sharing
memory contention