🤖 AI Summary
Persistent memory programs often sacrifice performance due to excessive use of flush and fence instructions, making it challenging to balance crash consistency with hardware efficiency. This work proposes a black-box binary rewriting technique that requires neither source code nor manual intervention. By leveraging semantic analysis and performance modeling, the method automatically identifies and optimizes redundant persistence instruction sequences while strictly preserving crash consistency semantics. Experimental evaluation on multiple real-world persistent memory applications demonstrates performance improvements of up to 15%, marking the first fully automated and efficient optimization of synchronization instructions for persistent memory programs.
📝 Abstract
Persistent Memory (PM) is a new storage technology that brings high performance, byte addressability, and persistency at a lower cost than DRAM. Due to cache volatility and store reordering, developers must use explicit instructions (e.g., flush and fence) to guarantee that the application state remains consistent after a crash. This is difficult to get right; in fact, several tools have been created to detect bugs in PM programs. To cope with this difficulty, programmers tend to be overly conservative, for instance by enforcing unnecessary ordering constraints, which partially forfeits the performance benefits of PM. In this paper, we study the impact that different combinations of persistency instructions have on several PM programs and find that a specific combination can lead to performance improvements while preserving the original crash-consistency semantics. Based on these results, we developed Bentō, an automatic, black-box binary rewriter that can boost the performance of existing PM programs by up to 15% with minimal programmer effort.
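To make the notion of an unnecessary ordering constraint concrete, the following is a minimal sketch of a toy persistency model in Python. It is not Bentō's transformation and it does not model real hardware: the class `ToyPM`, the helper functions, and the addresses `a`, `b`, and `valid` are all hypothetical names introduced for illustration. The model assumes the worst case, in which a store survives a crash only if it was flushed and a later fence completed.

```python
class ToyPM:
    """Toy worst-case model (an assumption, not real hardware semantics):
    stores land in a volatile cache, and a value reaches persistent
    media only after it is flushed and a subsequent fence completes."""

    def __init__(self):
        self.cache = {}    # volatile: lost on crash
        self.pending = {}  # flushed, but not yet ordered by a fence
        self.media = {}    # persistent: survives a crash
        self.fences = 0    # number of fences issued

    def store(self, addr, value):
        self.cache[addr] = value

    def flush(self, addr):
        # Capture the value currently in the cache for write-back.
        if addr in self.cache:
            self.pending[addr] = self.cache[addr]

    def fence(self):
        # All previously flushed values become persistent.
        self.fences += 1
        self.media.update(self.pending)
        self.pending.clear()

    def crash(self):
        # Only the contents of persistent media survive.
        return dict(self.media)


def conservative_update(pm):
    # A fence after every store, even though a and b are independent.
    for addr, value in (("a", 1), ("b", 2)):
        pm.store(addr, value)
        pm.flush(addr)
        pm.fence()
    pm.store("valid", 1)
    pm.flush("valid")
    pm.fence()


def batched_update(pm):
    # One fence covers both data fields; the flag is still ordered after them.
    pm.store("a", 1)
    pm.store("b", 2)
    pm.flush("a")
    pm.flush("b")
    pm.fence()
    pm.store("valid", 1)
    pm.flush("valid")
    pm.fence()


def run(update):
    pm = ToyPM()
    update(pm)
    return pm.crash(), pm.fences
```

In this sketch both variants keep the only ordering that matters, namely that `valid` is persisted after `a` and `b`, so `run(conservative_update)` and `run(batched_update)` return the same post-crash state while the batched variant issues one fence fewer. On real hardware, cache lines may also be evicted and persisted at arbitrary times, which this worst-case model deliberately ignores.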