The Hitchhiker's Guide to Programming and Optimizing CXL-Based Heterogeneous Systems

📅 2024-11-05
🏛️ arXiv.org
📈 Citations: 2
Influential: 1
🤖 AI Summary
The performance characteristics and architectural behaviors of cache-coherent interconnects, particularly Compute Express Link (CXL), remain poorly understood in multi-vendor heterogeneous systems (e.g., CPU + CXL memory devices).
Method: We construct a cross-vendor heterogeneous server cluster and propose Heimdall, the first fine-grained memory performance analysis framework tailored for CXL systems, accompanied by a lightweight microbenchmark suite. Through empirical measurement of CXL 3.0 protocol stack–hardware co-behavior, we systematically characterize memory latency, bandwidth, and coherence semantics across mainstream CXL devices.
Contribution/Results: We uncover three previously unknown architectural blind spots and implicit protocol stack constraints. Leveraging these insights, we devise practical, workload-aware memory scheduling strategies for database and AI inference workloads. Our work provides both theoretical foundations and actionable guidelines for designing and optimizing cache-coherent heterogeneous systems.

📝 Abstract
We present a thorough analysis of the use of CXL-based heterogeneous systems. We built a cluster of server systems that combines different vendors' CPUs and various types of CXL devices. We further developed a heterogeneous memory benchmark suite, Heimdall, to profile the performance of such heterogeneous systems. By leveraging Heimdall, we unveiled the detailed architecture design in these systems, drew observations on optimizing performance for workloads, and pointed out directions for future development of CXL-based heterogeneous systems.
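To make the idea of a memory latency microbenchmark concrete, the following is a minimal sketch of pointer chasing, a common technique for measuring dependent-access latency. It is not Heimdall's code: the function names `sattolo_cycle` and `chase` are hypothetical, and the listing does not say how Heimdall measures latency. Because the Python interpreter dominates the per-step cost, the printed number is not a hardware latency; the sketch only illustrates the access pattern. It also assumes that placement on a CXL-backed NUMA node is handled outside the script, for example with `numactl --membind`.

```python
import random
import time


def sattolo_cycle(n, seed=0):
    """Return a permutation of range(n) that forms one single cycle."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i is what makes this Sattolo's algorithm
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps, start=0):
    """Follow the chain for `steps` dependent accesses.

    Returns (final index, average nanoseconds per step).
    """
    idx = start
    t0 = time.perf_counter_ns()
    for _ in range(steps):
        idx = nxt[idx]  # each access depends on the previous result
    elapsed = time.perf_counter_ns() - t0
    return idx, elapsed / steps


if __name__ == "__main__":
    n = 1 << 20  # hypothetical working-set size, in elements
    chain = sattolo_cycle(n)
    _, ns = chase(chain, n)
    print(f"{ns:.1f} ns per dependent access (includes interpreter overhead)")
```

The single-cycle permutation matters because it forces every element to be visited once before the chain repeats, which defeats simple stride prefetching. A measurement intended to reflect hardware latency would normally implement the same loop in a compiled language.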
Problem

Research questions and friction points this paper is trying to address.

Analyze performance of cache-coherent heterogeneous systems
Compare CXL, NVLink-C2C, and Infinity Fabric interconnects
Optimize workloads for future heterogeneous system designs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzed cache-coherent links: CXL, NVLink-C2C, Infinity Fabric
Developed Heimdall benchmark for heterogeneous memory profiling
Unveiled architecture designs for performance optimization
Authors
Zixuan Wang
University of California San Diego
Suyash Mahar
University of California San Diego
Luyi Li
University of California San Diego
Jangseon Park
University of California San Diego
Jinpyo Kim
University of California San Diego
Theodore Michailidis
University of California San Diego
Yue Pan
University of California San Diego
Tajana Rosing
Distinguished Professor, UCSD
computer architecture, cyber-physical systems, system energy efficiency
Dean Tullsen
Department of Computer Science and Engineering, UC San Diego
computer architecture, security
Steven Swanson
University of California San Diego
Kyung Chang Ryoo
Samsung
Sungjoo Park
Samsung
Jishen Zhao
Professor at University of California, San Diego
Computer Architecture, Computer Systems, Machine Learning Systems, Electronic Design Automation