🤖 AI Summary
To address performance bottlenecks in CXL-interconnected SSDs—namely high latency and excessive CPU overhead from software prefetching—this paper proposes an LLC-offloaded heterogeneous prefetching architecture. Our approach introduces, for the first time, an expander-driven edge-side prefetching engine that jointly leverages CXL multi-level switching topology awareness and a back-invalidation cache coherence protocol to enable low-overhead, high-accuracy localized prefetching. We further establish an end-to-end latency model and a quantitative methodology for evaluating prefetching timeliness. Experimental results demonstrate 9.0× and 14.7× speedups for graph analytics and SPEC CPU benchmarks, respectively—substantially outperforming state-of-the-art CXL-SSD pooling-based prefetching schemes. The architecture reduces dependency on SSD accesses and significantly increases host cache direct-hit rates.
📝 Abstract
Integrating Compute Express Link (CXL) with SSDs allows scalable access to large memory, but at speeds slower than DRAM. We present ExPAND, an expander-driven CXL prefetcher that offloads last-level cache (LLC) prefetching from the host CPU to CXL-SSDs. ExPAND uses a heterogeneous prediction algorithm for prefetching and ensures data consistency with CXL.mem's back-invalidation. We examine prefetch timeliness for accurate latency estimation. Being aware of CXL's multi-tiered switching topology, ExPAND provides an end-to-end latency estimate for each CXL-SSD and precise prefetch timeliness estimations. Our method reduces reliance on CXL-SSD accesses and enables direct host cache access for most data. ExPAND improves the performance of graph applications and SPEC CPU by 9.0$\times$ and 14.7$\times$, respectively, surpassing CXL-SSD pools with diverse prefetching strategies.
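As a rough illustration of the timeliness criterion the abstract describes, a prefetch is useful only if it completes before the demand access, where completion time depends on the end-to-end latency through the multi-tiered switch topology to the target CXL-SSD. The sketch below is a minimal model with hypothetical function names and latency values; it is not the paper's actual estimation methodology.

```python
# Hedged sketch: a prefetch issued lead_time_ns before the demand access is
# "timely" if the data can traverse the switch tiers and the CXL-SSD's internal
# access path in that window. All names and numbers are illustrative
# assumptions, not values or interfaces from the paper.

def end_to_end_latency_ns(switch_hop_latencies_ns, ssd_access_latency_ns):
    """Latency to one CXL-SSD: sum of per-hop switch latencies plus the
    device's internal access latency."""
    return sum(switch_hop_latencies_ns) + ssd_access_latency_ns

def is_timely(lead_time_ns, switch_hop_latencies_ns, ssd_access_latency_ns):
    """True if the prefetched line can arrive in the LLC before it is needed."""
    return lead_time_ns >= end_to_end_latency_ns(
        switch_hop_latencies_ns, ssd_access_latency_ns)

# Example: two switch tiers (70 ns each) and a 1500 ns SSD read.
print(is_timely(2000, [70, 70], 1500))  # timely: 2000 >= 1640
print(is_timely(1000, [70, 70], 1500))  # late:   1000 <  1640
```

A topology-aware prefetcher would maintain one such latency figure per CXL-SSD and use it to decide how far ahead of the access stream predictions must run.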