🤖 AI Summary
This work addresses the challenge of adapting PCIe SSDs to CXL memory semantics so that they can serve as scalable, byte-addressable working memory. We propose a Type 3 CXL-SSD architecture that implements the CXL 3.0 protocol stack on an FPGA and introduces two instruction-level semantic annotations, Determinism and Bufferability, which control cache behavior and access determinism while preserving persistence. Combined with on-chip cache co-scheduling, the design tightly couples storage and memory semantics. The simulation-based evaluation shows that the CXL-SSD achieves 10.9× better performance than PCIe-based memory expanders, and that the annotation enhancements reduce latency by a further 5.4×. Under high-locality workloads, its performance approaches that of DRAM. To the best of our knowledge, this is the first SSD-based memory solution for the CXL ecosystem that simultaneously offers high performance, rich memory semantics, and strong execution determinism.
📝 Abstract
This paper explores how Compute Express Link (CXL) can transform PCIe-based block storage into a scalable, byte-addressable working memory. We address the challenges of adapting block storage to CXL's memory-centric model by emphasizing cacheability as a key enabler and advocating for Type 3 endpoint devices, referred to as CXL-SSDs. To validate our approach, we prototype a CXL-SSD on a custom FPGA platform and propose annotation mechanisms, Determinism and Bufferability, to enhance performance while preserving data persistency. Our simulation-based evaluation demonstrates that CXL-SSD achieves 10.9x better performance than PCIe-based memory expanders and further reduces latency by 5.4x with annotation enhancements. In workloads with high locality, CXL-SSD approaches DRAM-like performance due to efficient on-chip caching. This work highlights the feasibility of integrating block storage into CXL's ecosystem and provides a foundation for future memory-storage convergence.
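The locality claim can be read through the standard average-access-time model for a cache placed in front of a slower backend. The sketch below is only an illustration of that textbook model, not the paper's evaluation method: the function name `effective_latency_ns` and both latency constants are hypothetical placeholders, not figures reported by the authors.

```python
def effective_latency_ns(hit_rate: float, cache_ns: float, backend_ns: float) -> float:
    """Average access latency for a cache in front of a slower backend.

    Textbook weighted average: hits are served at cache_ns, misses at backend_ns.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * backend_ns


# Placeholder latencies chosen only for illustration.
# They are assumptions, NOT measurements from the paper.
CACHE_NS = 100.0       # hypothetical latency when the on-chip cache hits
BACKEND_NS = 10_000.0  # hypothetical latency when the access goes to the SSD backend

if __name__ == "__main__":
    for hit_rate in (0.50, 0.90, 0.99):
        latency = effective_latency_ns(hit_rate, CACHE_NS, BACKEND_NS)
        print(f"hit rate {hit_rate:.2f}: {latency:.0f} ns average")
```

Under this model the average latency converges on the cache latency as the hit rate approaches 1, which is the intuition behind high-locality workloads seeing DRAM-like behavior.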