Extending the Applicability of Bloom Filters by Relaxing their Parameter Constraints

📅 2025-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Bloom filters in key-value stores suffer from parameter rigidity: the number of hash functions must be an integer, and bit-array lengths are typically constrained to powers of two, which limits how far the false-positive rate (FPR) and space efficiency can be optimized. This paper proposes two variants that relax these constraints. First, the Rational Bloom Filter (RBF) supports a rational, non-integer number of hash functions, breaking the integer constraint through fractional hashing built on standard hash functions (e.g., Murmur). Second, the Variably-Sized Block Bloom Filter (VSBF) uses a block-based memory layout whose index mapping still relies on efficient binary arithmetic, removing the power-of-two restriction on bit-array length. Both designs remain compatible with standard hashing and model the number of hash functions as a rational parameter for finer FPR control. The reported results are a lower FPR at identical space overhead, computation that stays efficient for large filters, and a fraction of zero bits close to the theoretical optimum.
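To make the two constraints concrete, here is a small Python sketch. It relies only on the textbook results for a standard Bloom filter, namely the optimum k = (m/n) ln 2 and the FPR approximation (1 - e^(-kn/m))^k. The function names and the example sizes are illustrative assumptions, not taken from the paper.

```python
import math

def optimal_num_hashes(m: int, n: int) -> float:
    """Textbook optimum k = (m / n) * ln 2 for m bits and n inserted elements."""
    return (m / n) * math.log(2)

def false_positive_rate(m: int, n: int, k: int) -> float:
    """Standard approximation (1 - e^(-k*n/m))^k for an integer k."""
    return (1.0 - math.exp(-k * n / m)) ** k

def index_pow2(h: int, m: int) -> int:
    """Map hash value h to [0, m) with a bit mask; valid only if m is a power of two."""
    assert m > 0 and m & (m - 1) == 0, "m must be a power of two"
    return h & (m - 1)  # same result as h % m, without a division

# Illustrative sizes (assumed): 2^16 bits, 10,000 elements.
m, n = 1 << 16, 10_000
k_opt = optimal_num_hashes(m, n)  # about 4.54, so not an integer
fpr_floor = false_positive_rate(m, n, math.floor(k_opt))
fpr_ceil = false_positive_rate(m, n, math.ceil(k_opt))
fpr_ideal = 0.5 ** k_opt  # value of the same formula at the real-valued optimum
```

With these sizes the optimum falls between 4 and 5 hash functions, so either integer choice gives an FPR slightly above the value at the real-valued optimum. The mask in `index_pow2` is the efficiency that ties filter lengths to powers of two.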

📝 Abstract
Key-value stores are widely used for scalable data storage. In this setting, Bloom filters serve as an efficient probabilistic data structure for representing sets of keys, as they allow set membership queries with controllable false positive rates and no false negatives. For optimal error rates, the right choice of the main parameters is crucial: the length of the Bloom filter array, the number of hash functions used to map an element to the array's indices, and the number of elements to be inserted into one filter. However, these parameters are constrained: the number of hash functions is restricted to integer values, and the length of a Bloom filter is usually chosen to be a power of two to allow for efficient modulo operations using binary arithmetic. These modulo calculations are necessary to map from the output universe of the applied universal hash functions, like Murmur, to the set of indices of the Bloom filter. In this paper, we relax these constraints by proposing the Rational Bloom filter, which allows for non-integer numbers of hash functions. This results in optimized fraction-of-zero values for a known number of elements to be inserted. Based on this, we construct the Variably-Sized Block Bloom filters to allow for a flexible filter length, especially for large filters, while keeping computation efficient.
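The abstract does not spell out how a non-integer number of hash functions is realized. One plausible reading, sketched below in Python, is to always apply floor(k) hash functions and to apply one additional function only for a fraction k - floor(k) of the elements, with that choice derived from a hash of the element so that inserts and queries agree. The class name, the `EXTRA_SEED` constant, and the use of BLAKE2b in place of Murmur are assumptions made for illustration and may differ from the paper's construction.

```python
import hashlib
import math

EXTRA_SEED = 2**32  # hypothetical seed reserved for the per-element decision

class RationalBloomFilter:
    """Illustrative filter with a non-integer number of hash functions k."""

    def __init__(self, m: int, k: float):
        self.m = m
        self.k_floor = math.floor(k)
        self.k_frac = k - self.k_floor
        self.bits = bytearray(m)  # one byte per bit, for readability only

    def _hash(self, item: bytes, seed: int) -> int:
        # Seeded 64-bit hash; BLAKE2b stands in for Murmur (assumption).
        salt = seed.to_bytes(8, "little")
        digest = hashlib.blake2b(item, digest_size=8, salt=salt).digest()
        return int.from_bytes(digest, "little")

    def _indices(self, item: bytes) -> list:
        idx = [self._hash(item, s) % self.m for s in range(self.k_floor)]
        # Deterministic per-element decision: the same element always gets
        # the same answer, which keeps the filter free of false negatives.
        u = self._hash(item, EXTRA_SEED) / 2**64  # pseudo-uniform in [0, 1)
        if u < self.k_frac:
            idx.append(self._hash(item, self.k_floor) % self.m)
        return idx

    def add(self, item: bytes) -> None:
        for i in self._indices(item):
            self.bits[i] = 1

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[i] for i in self._indices(item))

# Usage: k = 4.5 means every element uses 4 positions and about half use a 5th.
rbf = RationalBloomFilter(m=10007, k=4.5)
rbf.add(b"key-1")
print(b"key-1" in rbf)
```

The length 10007 is deliberately not a power of two, so this sketch falls back to a plain modulo. That per-lookup cost is what the Variably-Sized Block Bloom filter is meant to avoid.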

Problem

Research questions and friction points this paper is trying to address.

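Bloom filter parameters are constrained: the number of hash functions must be an integer, and the filter length is usually a power of two.
These constraints keep a filter from reaching the optimal error rate for a known number of inserted elements.
Flexible filter lengths are needed, especially for large filters, without giving up efficient index computation.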

Innovation

Methods, ideas, or system contributions that make the work stand out.

Rational Bloom filter
Non-integer numbers of hash functions
Variably-Sized Block Bloom filters