🤖 AI Summary
This work investigates the space lower bound for approximate membership query data structures (such as Bloom filter variants) that support dynamic insertions and deletions, under a relaxed, fault-tolerant model that permits duplicate insertions and deletions of non-members.
Method: Leveraging information-theoretic analysis, combinatorial counting, and binomial entropy bounds, we construct probabilistic worst-case instances to derive a lower bound.
Contribution/Results: We prove an information-theoretic lower bound of at least ½·(1 − ε⁺ − 1/n)·log C(u,n) − O(n) bits, where u is the universe size, n the cardinality of the dataset, and ε⁺ the false-positive probability. By contrast, in the classical restricted setting (1+o(1))·n·log₂(1/ε⁺) + O(n) bits suffice (Bender et al., 2018; Bercea and Even, 2019), a quantity that does not depend on u. The lower bound in the relaxed setting grows with the universe size, so fault tolerance can cost more space than the classical models require when u is large relative to n and 1/ε⁺. The result points to a trade-off between fault tolerance and space efficiency in dynamic approximate set representations and provides a theoretical benchmark for the design of such data structures.
📝 Abstract
Designs of data structures for approximate membership queries with false-positive errors that support both insertions and deletions stipulate the following two conditions: (1) Duplicate insertions are prohibited, i.e., it is prohibited to insert an element $x$ if $x$ is currently a member of the dataset. (2) Deletions of nonelements are prohibited, i.e., it is prohibited to delete $x$ if $x$ is not currently a member of the dataset. Under these conditions, the space required for the approximate representation of a dataset of cardinality $n$ with a false-positive probability of $\epsilon^{+}$ is at most $(1+o(1))n\cdot\log_2 (1/\epsilon^{+}) + O(n)$ bits [Bender et al., 2018; Bercea and Even, 2019]. We prove that if these conditions are lifted, then the space required for the approximate representation of datasets of cardinality $n$ from a universe of cardinality $u$ is at least $\frac{1}{2} \cdot (1-\epsilon^{+} -\frac{1}{n})\cdot \log \binom{u}{n} -O(n)$ bits.
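
To get a feel for how the two bounds compare, the following sketch evaluates only their leading terms for chosen parameters. It is an illustration under stated assumptions, not part of the paper: the function names are hypothetical, the logarithm in the lower bound is taken to be base 2 (the abstract writes only "log"), and the $o(1)$ factor and both $O(n)$ terms are dropped, so the numbers indicate orders of magnitude rather than exact space requirements.

```python
import math


def log2_binom(u: int, n: int) -> float:
    """Floating-point approximation of log2 C(u, n) via the log-gamma function."""
    nats = math.lgamma(u + 1) - math.lgamma(n + 1) - math.lgamma(u - n + 1)
    return nats / math.log(2)


def restricted_upper_leading(n: int, eps: float) -> float:
    """Leading term n * log2(1/eps) of the upper bound in the restricted setting.

    Assumption: the (1 + o(1)) factor and the additive O(n) term are ignored.
    """
    return n * math.log2(1.0 / eps)


def relaxed_lower_leading(u: int, n: int, eps: float) -> float:
    """Leading term (1/2) * (1 - eps - 1/n) * log2 C(u, n) of the lower bound.

    Assumptions: log is base 2, and the subtracted O(n) term is ignored.
    """
    return 0.5 * (1.0 - eps - 1.0 / n) * log2_binom(u, n)


if __name__ == "__main__":
    # Illustrative parameters only; they do not come from the paper.
    u, n, eps = 2**48, 10**6, 2.0**-8
    print(f"restricted upper (leading term): {restricted_upper_leading(n, eps):.3e} bits")
    print(f"relaxed lower (leading term):    {relaxed_lower_leading(u, n, eps):.3e} bits")
```

Since $\log\binom{u}{n}$ grows with $u$ while $n\log_2(1/\epsilon^{+})$ does not, the gap between the two leading terms widens as the universe grows for fixed $n$ and $\epsilon^{+}$. For very large $u$ the log-gamma difference loses floating-point precision, so an exact integer computation would be preferable there.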