🤖 AI Summary
Learning-augmented data structures speed up queries by using predicted access frequencies, but a layout that adapts to the data can leak information about past operations. Prior work has not combined strong history independence, robustness to prediction errors, and support for dynamic updates in a single structure. This paper proposes the first learning-augmented data structure that provides all three. It introduces thresholding, which makes any learning-augmented data structure robust, and pairing, a simple technique that provides strong history independence in the dynamic setting, so that the memory layout reveals nothing about past insertions or deletions beyond what the current contents imply. Experiments show a tradeoff between security and efficiency, with performance that remains competitive with the state of the art.
📝 Abstract
Learning-augmented data structures use predicted frequency estimates to retrieve frequently occurring database elements faster than standard data structures. Recent work has developed data structures that optimally exploit these frequency estimates while maintaining robustness to adversarial prediction errors. However, the privacy and security implications of this setting remain largely unexplored.
In the event of a security breach, data structures should reveal minimal information beyond their current contents. This is even more crucial for learning-augmented data structures, whose layout adapts to the data. A data structure is history independent if its memory representation reveals no information about past operations beyond what can be inferred from its current contents. In this work, we take the first step towards privacy and security guarantees in this setting by proposing the first learning-augmented data structure that is strongly history independent, robust, and supports dynamic updates.
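The definition of history independence above can be illustrated with a small sketch. It is not the data structure proposed in the paper; the class names `InsertionOrderSet` and `CanonicalSet` are hypothetical and exist only for this example. The first class stores keys in arrival order, so its layout depends on the operation history. The second keeps a canonical (sorted) layout, which is one standard way to make the representation a function of the current contents alone.

```python
import bisect


class InsertionOrderSet:
    """Hypothetical example: layout follows insertion order, so it leaks history."""

    def __init__(self):
        self.cells = []

    def insert(self, key):
        if key not in self.cells:
            self.cells.append(key)

    def delete(self, key):
        if key in self.cells:
            self.cells.remove(key)

    def layout(self):
        return tuple(self.cells)


class CanonicalSet:
    """Hypothetical example: layout is always sorted, so it depends only on contents."""

    def __init__(self):
        self.cells = []

    def insert(self, key):
        i = bisect.bisect_left(self.cells, key)
        if i == len(self.cells) or self.cells[i] != key:
            self.cells.insert(i, key)

    def delete(self, key):
        i = bisect.bisect_left(self.cells, key)
        if i < len(self.cells) and self.cells[i] == key:
            del self.cells[i]

    def layout(self):
        return tuple(self.cells)


def run(structure_cls, ops):
    """Apply a list of (operation, key) pairs and return the final memory layout."""
    s = structure_cls()
    for op, key in ops:
        getattr(s, op)(key)
    return s.layout()


# Two different histories that end with the same contents {1, 2, 3}.
history_a = [("insert", 3), ("insert", 1), ("insert", 2)]
history_b = [("insert", 2), ("insert", 9), ("insert", 1),
             ("delete", 9), ("insert", 3)]

leaky_a = run(InsertionOrderSet, history_a)
leaky_b = run(InsertionOrderSet, history_b)
canon_a = run(CanonicalSet, history_a)
canon_b = run(CanonicalSet, history_b)
```

Here `leaky_a` and `leaky_b` differ even though both structures hold the same keys, so an observer of memory learns something about the order of operations. `canon_a` and `canon_b` are identical. The difficulty the abstract points to is that a learning-augmented layout is deliberately not a fixed canonical order, since it adapts to predicted frequencies.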
To achieve this, we introduce two techniques: thresholding, which automatically makes any learning-augmented data structure robust, and pairing, a simple technique that provides strong history independence in the dynamic setting. Our experiments demonstrate a tradeoff between security and efficiency, yet our data structure remains competitive with the state of the art.
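The abstract does not spell out the thresholding rule. The sketch below shows one plausible reading, assumed here purely for illustration: clamp every predicted frequency from below at 1/n and renormalize. The function names `threshold_predictions` and `target_depth`, the 1/n floor, and the depth formula are all assumptions and are not taken from the paper.

```python
import math


def threshold_predictions(predicted):
    """Clamp each predicted frequency to at least 1/n, then renormalize.

    Assumed rule for illustration only. If the inputs sum to at most 1,
    the clamped values sum to at most 2, so every output is at least 1/(2n).
    """
    n = len(predicted)
    if n == 0:
        return {}
    floor = 1.0 / n
    clamped = {key: max(p, floor) for key, p in predicted.items()}
    total = sum(clamped.values())
    return {key: p / total for key, p in clamped.items()}


def target_depth(p):
    """Hypothetical cost model: a key with frequency p sits at depth ceil(log2(1/p))."""
    return math.ceil(math.log2(1.0 / p))


# A bad prediction: keys "c" and "d" are predicted never to occur.
predicted = {"a": 0.97, "b": 0.03, "c": 0.0, "d": 0.0}
robust = threshold_predictions(predicted)
depths = {key: target_depth(p) for key, p in robust.items()}
worst_depth = max(depths.values())
```

Under this assumed rule, a key whose frequency was predicted as zero still receives a share of at least 1/(2n), so its target depth is at most about log2(n) + 1 regardless of how wrong the predictor is, while the key predicted to be most frequent keeps the largest share. This matches the robustness goal stated in the abstract in spirit only; the paper's actual mechanism may differ.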