Learned Static Function Data Structures

📅 2025-10-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the space bottleneck of static function data structures. Conventional compressed static functions encode the value sequence at a cost bounded below by its zero-order empirical entropy. Method: The authors propose learned static functions (LSFs), which, under the assumptions of a fixed key set, point queries only, and arbitrary outputs for out-of-set keys, exploit correlations between keys and values. A machine learning model predicts, for each key, a probability distribution over the values; from this distribution a key-specific adaptive prefix code is derived, and the resulting codeword is stored in a classical static function structure. Contribution/Results: The core innovation is co-designing probabilistic prediction with deterministic data structures to achieve model-driven compact representation. Experiments demonstrate up to one order of magnitude of space reduction on real-world datasets and up to three orders of magnitude on synthetic data, breaking the zero-order entropy barrier.

📝 Abstract
We consider the task of constructing a data structure for associating a static set of keys with values, while allowing arbitrary output values for queries involving keys outside the set. Compared to hash tables, these so-called static function data structures do not need to store the key set and thus use significantly less memory. Several techniques are known, with compressed static functions approaching the zero-order empirical entropy of the value sequence. In this paper, we introduce learned static functions, which use machine learning to capture correlations between keys and values. For each key, a model predicts a probability distribution over the values, from which we derive a key-specific prefix code to compactly encode the true value. The resulting codeword is stored in a classic static function data structure. This design allows learned static functions to break the zero-order entropy barrier while still supporting point queries. Our experiments show substantial space savings: up to one order of magnitude on real data, and up to three orders of magnitude on synthetic data.
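The pipeline in the abstract (model predicts a per-key value distribution, a prefix code is derived from it, only the codeword is stored) can be sketched as follows. This is a minimal illustration, not the paper's implementation: `predict_dist` is a hypothetical toy model, and a plain dict stands in for the static function layer, which in the real design stores codewords without storing the keys themselves.

```python
import heapq

def huffman_code(dist):
    """Build a prefix code {value: bitstring} from a distribution {value: prob}."""
    if len(dist) == 1:
        return {next(iter(dist)): "0"}
    heap = [(p, i, {v: ""}) for i, (v, p) in enumerate(sorted(dist.items()))]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {v: "0" + b for v, b in c1.items()}
        merged.update({v: "1" + b for v, b in c2.items()})
        heapq.heappush(heap, (p1 + p2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def predict_dist(key, values):
    """Hypothetical toy model: puts most mass on the value correlated with the key."""
    likely = values[key % len(values)]
    rest = 0.1 / (len(values) - 1)
    return {v: (0.9 if v == likely else rest) for v in values}

def build_lsf(pairs, values):
    """Encode each true value with its key-specific code; keep only codewords."""
    store = {}  # stand-in for a static function data structure
    for k, v in pairs:
        store[k] = huffman_code(predict_dist(k, values))[v]
    return store

def query_lsf(store, key, values):
    """Re-derive the same key-specific code at query time and decode."""
    code = huffman_code(predict_dist(key, values))
    inverse = {bits: v for v, bits in code.items()}
    return inverse[store[key]]
```

When the model predicts well, the stored codeword for the likely value is short, which is the source of the space savings; queries stay point lookups because the code is deterministically re-derived from the key.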
Problem

Research questions and friction points this paper is trying to address.

Constructing memory-efficient data structures for key-value mappings
Breaking the zero-order entropy barrier using machine learning
Enabling compact value encoding through key-value correlation modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learned static functions model per-key value distributions with machine learning
Key-specific prefix codes derived from model predictions compress the stored values
Breaks the zero-order entropy barrier of compressed static functions
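The "entropy barrier" point can be made concrete with a small calculation: classic compressed static functions cannot go below the zero-order empirical entropy of the value sequence, but when keys predict values, the key-conditioned code length can be far smaller. This is an illustrative sketch with synthetic data of my own construction, not a figure from the paper.

```python
import math

def h0(values):
    """Zero-order empirical entropy of a value sequence, in bits per value."""
    n = len(values)
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# Synthetic example: the value is fully determined by the key's parity.
pairs = [(k, k % 2) for k in range(1000)]
vals = [v for _, v in pairs]

print(h0(vals))  # → 1.0 (bits/value): the classic lower bound for this sequence
# A model that learns v = k % 2 assigns probability 1 to the true value for
# every key, so the key-specific code length approaches 0 bits per value.
```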