🤖 AI Summary
This work identifies an unforeseen interaction between two hardware-based defenses, TORC (timing obfuscations of remote cache lines) and DSRC (delaying speculative changes to remote cache lines). DSRC lets remote coherence state, specifically Exclusive (E), reach the processor core, where TORC's timing obfuscation cannot protect it. An attacker can then trigger redo operations inside the core to detect whether a remote cache line is present, which the authors demonstrate as a new covert channel. Two mitigations are proposed: (1) never sending remote E state information to the core, or (2) never creating the remote E state at all. In GEM5 simulations with SPECrate 2017 and PARSEC benchmarks, the two fixes show less than 32% average overhead across both suites, and the fix that prevents creation of remote E state has less than 2.8% average overhead. The results illustrate the pitfalls of composing hardware security mechanisms without considering how they affect each other.
📝 Abstract
Microarchitectural attacks are a significant concern, leading to many hardware-based defense proposals. However, different defenses target different classes of attacks, and their impact on each other has not been fully considered. To raise awareness of this problem, we study an interaction between two state-of-the-art defenses: timing obfuscations of remote cache lines (TORC) and delaying speculative changes to remote cache lines (DSRC). TORC mitigates cache-hit-based attacks, and DSRC mitigates speculative coherence state change attacks. We observe that DSRC enables coherence information to be retrieved into the processor core, where it is beyond the reach of timing obfuscations. As an unforeseen consequence, redo operations can be triggered within the core to detect the presence or absence of remote cache lines, which constitutes a security vulnerability. We demonstrate that a new covert channel attack is possible using this vulnerability. We propose two ways to mitigate the attack, whose performance varies depending on an application's cache usage. One way is to never send remote exclusive coherence state (E) information to the core, even if it is created. The other is to never create a remote E state, which is responsible for triggering redos. We demonstrate the timing difference caused by this violation of a microarchitectural defense assumption using GEM5 simulations. Performance evaluation of the two fixes on SPECrate 2017 and PARSEC benchmarks shows less than 32% average overhead across both sets of benchmarks. The repair that prevents the creation of remote E state has less than 2.8% average overhead.
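The covert channel and the two fixes described above can be illustrated with a deliberately simplified toy model. This is a sketch under stated assumptions, not the paper's GEM5 setup: the latency constants, the `ToyModel` class, the `transmit` helper, and the choice of Shared (S) as the state used when E is never created are all hypothetical and chosen only to show the logic of the channel.

```python
from dataclasses import dataclass

# Hypothetical cycle counts; the paper's measured timings are not reproduced here.
BASE_LATENCY = 100   # assumed access latency as seen through timing obfuscation
REDO_PENALTY = 40    # assumed extra cycles when a redo is triggered in the core


@dataclass
class ToyModel:
    suppress_remote_e: bool = False  # fix 1: never send remote E state to the core
    never_create_e: bool = False     # fix 2: never create a remote E state

    def remote_state(self, sender_holds_line: bool) -> str:
        """Coherence state of the line in the remote (sender's) cache."""
        if not sender_holds_line:
            return "I"
        # Assumption: without E, a held line sits in a non-exclusive state.
        return "S" if self.never_create_e else "E"

    def receiver_latency(self, sender_holds_line: bool) -> int:
        """Latency the receiver observes for one probe of the shared line."""
        state = self.remote_state(sender_holds_line)
        reported = None if self.suppress_remote_e else state
        redo = reported == "E"  # a remote E state reaching the core triggers a redo
        return BASE_LATENCY + (REDO_PENALTY if redo else 0)


def transmit(bits, model):
    """Sender encodes 1 by holding the line; receiver decodes by thresholding."""
    threshold = BASE_LATENCY + REDO_PENALTY // 2
    return [int(model.receiver_latency(bool(b)) > threshold) for b in bits]


message = [1, 0, 1, 1, 0]
leaked = transmit(message, ToyModel())
with_fix1 = transmit(message, ToyModel(suppress_remote_e=True))
with_fix2 = transmit(message, ToyModel(never_create_e=True))
```

In this model the unmitigated configuration recovers the message exactly, while either fix removes the redo and makes every probe look the same to the receiver, so the decoded bits carry no information about the sender.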