Garibaldi: A Pairwise Instruction-Data Management for Enhancing Shared Last-Level Cache Performance in Server Workloads

📅 2025-05-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Server workloads exhibit large instruction footprints that cause severe instruction misses in the last-level cache (LLC); these misses stall the CPU frontend on instruction fetch latency, yet conventional LLC designs prioritize data management and neglect this bottleneck. Method: The paper proposes Garibaldi, an instruction-data co-aware LLC management mechanism with (1) a pairwise hotness model that couples the hotness of an instruction line with that of the data accesses it drives; (2) a cost-sensitive selective protection policy that shields high-cost instruction cache lines from eviction; and (3) conservative prefetching of the paired data lines while an instruction miss is being serviced, hiding part of the frontend stall. Contribution/Results: Implemented on top of a modern LLC design including Mockingjay and evaluated with 16 real-world server workloads on a 40-core machine, Garibaldi improves performance by 13.2% over the baseline LLC and by 6.1% over the state-of-the-art Mockingjay design.

📝 Abstract
Modern CPUs suffer from a frontend bottleneck because the instruction footprint of server workloads exceeds the private cache capacity. Prior works have examined CPU frontend components or the private caches to improve the instruction hit rate. The large footprint, however, leads to significant cache misses not only in the core-side upper-level caches but also in the last-level cache (LLC). We observe that even with an advanced branch predictor and instruction prefetching techniques, a considerable number of instruction accesses descend to the LLC. Yet state-of-the-art LLC designs with elaborate data management overlook the instruction misses that precede the corresponding data accesses. Specifically, when an instruction requiring numerous data accesses misses, the CPU frontend must wait for the instruction fetch, regardless of how much of that data is present in the LLC. To preserve hot instructions in the LLC, we propose Garibaldi, a novel pairwise instruction-data management scheme. Garibaldi tracks the hotness of instruction accesses by coupling it with that of the associated data accesses, and adopts two management techniques. On the one hand, a selective protection mechanism prevents the eviction of high-cost instruction cachelines. On the other hand, when an unprotected instruction line misses, Garibaldi conservatively issues prefetch requests for the paired data lines while that miss is being handled. In our experiments, we evaluate Garibaldi with 16 server workloads on a 40-core machine, implementing it on top of a modern LLC design including Mockingjay. Garibaldi improves CPU performance by 13.2% over the baseline LLC design and by 6.1% over Mockingjay.
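The pairwise hotness tracking described above can be sketched in software. This is an illustrative toy, not the paper's mechanism (which would be a hardware table with bounded counters): the class name, the fields, and the additive cost metric are all assumptions made for this example.

```python
# Toy sketch of pairwise instruction-data hotness tracking (assumed design,
# not Garibaldi's actual hardware structure).
from collections import defaultdict

LINE = 64  # cache line size in bytes (assumed)

class PairwiseHotness:
    def __init__(self):
        self.inst_hot = defaultdict(int)     # instruction line -> access count
        self.data_hot = defaultdict(int)     # data line -> access count
        self.paired_data = defaultdict(set)  # instruction line -> data lines it touches

    def record(self, pc, data_addr):
        """One dynamic memory access: the instruction at `pc` touches `data_addr`."""
        iline, dline = pc // LINE, data_addr // LINE
        self.inst_hot[iline] += 1
        self.data_hot[dline] += 1
        self.paired_data[iline].add(dline)

    def cost(self, iline):
        """Combined hotness: an instruction line is costly to miss when it both
        executes often and gates many hot data accesses."""
        return self.inst_hot[iline] + sum(self.data_hot[d]
                                          for d in self.paired_data[iline])

h = PairwiseHotness()
h.record(0x1000, 0x8000)
h.record(0x1000, 0x8040)
h.record(0x2000, 0x8000)  # a second instruction line sharing one data line
print(h.cost(0x1000 // LINE))  # → 5 (2 inst accesses + data hotness 2 + 1)
```

Under this metric, an instruction line that drives many hot data lines outranks one with the same execution count but a cold data working set, which is the intuition behind protecting "high-cost" instruction lines.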
Problem

Research questions and friction points this paper is trying to address.

Reducing instruction misses in the last-level cache (LLC) for server workloads
Managing instruction-data pairs to enhance LLC performance in CPUs
Improving CPU frontend efficiency by preserving hot instructions in LLC
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pairwise instruction-data management scheme
Selective protection for high-cost instructions
Prefetch paired data during instruction misses
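The two policies in the Innovation list can be sketched as follows. This is an illustrative Python toy under stated assumptions: a per-line cost function (as produced by the pairwise hotness model), an instruction/data predicate, and hypothetical `in_llc`/`prefetch` hooks; the threshold and prefetch budget are invented for the example.

```python
# Assumed-for-illustration sketch of selective protection and conservative
# paired-data prefetching; not the paper's hardware implementation.
PROTECT_THRESHOLD = 4  # assumed hotness cutoff for protecting instruction lines

def choose_victim(candidates, cost, is_inst):
    """Selective protection: never evict a high-cost instruction line while any
    unprotected candidate exists; otherwise fall back to minimum cost overall."""
    unprotected = [l for l in candidates
                   if not (is_inst(l) and cost(l) >= PROTECT_THRESHOLD)]
    pool = unprotected or candidates
    return min(pool, key=cost)

def on_inst_miss(iline, paired_data, in_llc, prefetch, max_prefetch=2):
    """Conservative prefetch: while the instruction fetch is in flight, issue at
    most `max_prefetch` prefetches for paired data lines absent from the LLC."""
    issued = 0
    for dline in paired_data.get(iline, ()):
        if issued >= max_prefetch:
            break
        if not in_llc(dline):
            prefetch(dline)
            issued += 1
    return issued
```

The small prefetch budget is what makes the policy "conservative": it bounds bandwidth spent on speculation to the window already opened by the instruction fetch latency.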