Bounding the Fragmentation of B-Trees Subject to Batched Insertions

📅 2026-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the low space utilization (internal fragmentation) of B-trees under batched insertions, in which a small run of consecutive keys is inserted at a single position. It generalizes Yao's classical analysis, which shows that evenly splitting leaves under uniformly random insertions yields about 69% space utilization, by reformulating the analytical structure behind that result so that it extends to the batched model. The authors use the reformulated analysis to argue that even splitting continues to work well for many batched workloads. For the remaining workloads, they develop simple alternative splitting strategies that provably maintain good space utilization.

📝 Abstract
Internal fragmentation in data structures is a fundamental challenge in database design. A seminal result of Yao in this field shows that evenly splitting the leaves of a B-tree against a workload of uniformly random insertions achieves space utilization of around 69%. However, many database applications perform batched insertions, where a small run of consecutive keys is inserted at a single position. We develop a generalization of Yao's analysis that provides a rigorous treatment of such batched workloads. Our approach reformulates the analytical structure underlying Yao's result in a way that enables this generalization, and we use it to argue that even splitting works well for many workloads in our extended class. For the remaining workloads, we develop simple alternative strategies that provably maintain good space utilization.
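The setting described in the abstract can be explored empirically with a small simulation. The sketch below is an illustration only, not the paper's analysis or method: it models just the leaf level of a B-tree, splits an overflowing leaf evenly, and reports the fraction of leaf slots in use. The function name `simulate_utilization`, the leaf capacity of 16, and the `1e-12` key spacing used to imitate a run of consecutive keys are all assumptions made for this example.

```python
import bisect
import random


def simulate_utilization(num_batches, batch_size=1, capacity=16, seed=0):
    """Return leaf-level space utilization after inserting batches of keys.

    Hypothetical model (not taken from the paper): only leaves are tracked,
    each batch is a run of `batch_size` nearly adjacent keys starting at a
    uniformly random position, and an overflowing leaf is split evenly.
    """
    rng = random.Random(seed)
    leaves = [[]]             # each leaf is a sorted list of keys
    lows = [float("-inf")]    # lows[i] is the smallest key routed to leaves[i]

    for _ in range(num_batches):
        start = rng.random()  # uniformly random insertion position
        for j in range(batch_size):
            key = start + j * 1e-12  # assumed spacing for "consecutive" keys
            i = bisect.bisect_right(lows, key) - 1
            leaf = leaves[i]
            bisect.insort(leaf, key)
            if len(leaf) > capacity:
                # Even split: the two halves differ in size by at most one key.
                mid = len(leaf) // 2
                right = leaf[mid:]
                del leaf[mid:]
                leaves.insert(i + 1, right)
                lows.insert(i + 1, right[0])

    total_keys = sum(len(leaf) for leaf in leaves)
    return total_keys / (len(leaves) * capacity)


if __name__ == "__main__":
    # batch_size=1 corresponds to uniformly random single-key insertions.
    print("random insertions:", simulate_utilization(20000, batch_size=1))
    # Larger batches imitate short runs of consecutive keys at one position.
    print("batches of 8:     ", simulate_utilization(2500, batch_size=8))
```

With `batch_size=1` the measured utilization should land in the neighborhood of the 69% figure attributed to Yao, although finite leaf capacity and sample size shift it somewhat. Varying `batch_size` relative to `capacity` gives a rough, informal feel for how batched workloads can change the result.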
Problem

Research questions and friction points this paper is trying to address.

B-tree
internal fragmentation
batched insertions
space utilization
database design
Innovation

Methods, ideas, or system contributions that make the work stand out.

B-trees
batched insertions
internal fragmentation
space utilization
Yao's analysis