BinomialHash: A Constant Time, Minimal Memory Consistent Hash Algorithm

📅 2024-06-28

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the high data redistribution overhead caused by frequent node churn in dynamic distributed systems, this paper proposes BinomialHash, the first consistent hashing algorithm that strictly achieves O(1) time complexity and O(1) memory footprint. The method combines binomial distribution-based probabilistic modeling with deterministic index mapping, eliminating the reliance on virtual nodes or randomized sampling found in prior approaches. The authors formally prove its theoretical optimality. Experimental results show that BinomialHash significantly reduces data migration volume compared with existing consistent hashing schemes (e.g., Jump Hash and Rendezvous Hash), improves throughput by up to 2.3×, and keeps scale-in/out latency consistently within the microsecond range. The algorithm is particularly suited to high-concurrency, low-latency distributed storage and service discovery systems.

📝 Abstract

Consistent hashing is a technique for distributing data across a network of nodes in a way that minimizes reorganization when nodes join or leave the network. It is extensively applied in modern distributed systems as a fundamental mechanism for routing and data placement. Similarly, distributed storage systems rely on consistent hashing for scalable and fault-tolerant data partitioning. This paper introduces BinomialHash, a consistent hashing algorithm that executes in constant time and requires minimal memory. We provide a detailed explanation of the algorithm, present a pseudo-code implementation, and formally establish its strong theoretical guarantees. Finally, we compare its performance against state-of-the-art constant-time consistent hashing algorithms, demonstrating that our solution is both highly competitive and effective, while also validating the theoretical boundaries.
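The "minimizes reorganization" property can be made concrete with a small sketch. The code below does not implement BinomialHash, whose pseudo-code is not reproduced on this page. It uses the well-known jump consistent hash of Lamping and Veach as a stand-in and compares it with plain modulo placement when a cluster grows from 10 to 11 nodes. The helper names (`key_to_u64`, `moved_fraction`), the key format, and the cluster sizes are assumptions chosen for illustration.

```python
import hashlib

MASK64 = 0xFFFFFFFFFFFFFFFF


def key_to_u64(key: str) -> int:
    """Hypothetical helper: map a string key to a 64-bit integer."""
    digest = hashlib.blake2b(key.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def jump_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash (stand-in, not BinomialHash)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, as in the published algorithm.
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def moved_fraction(assign, keys, old_n, new_n):
    """Fraction of keys whose bucket changes when the cluster is resized."""
    moved = sum(1 for k in keys if assign(k, old_n) != assign(k, new_n))
    return moved / len(keys)


keys = [key_to_u64(f"object-{i}") for i in range(10_000)]

# Naive placement: bucket = key mod n. Most keys change bucket on resize.
modulo_moved = moved_fraction(lambda k, n: k % n, keys, 10, 11)

# Consistent placement: only keys destined for the new bucket move.
jump_moved = moved_fraction(jump_hash, keys, 10, 11)

print(f"modulo: {modulo_moved:.1%} of keys moved")
print(f"jump:   {jump_moved:.1%} of keys moved")
```

Under these assumptions, modulo placement relocates roughly 10 in 11 keys, while the consistent scheme relocates roughly 1 in 11, and every relocated key lands on the newly added node. A consistent hashing algorithm is expected to provide this behavior regardless of how fast its lookup runs.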

Problem

Research questions and friction points this paper is trying to address.

Distributes data across nodes with minimal reorganization

Executes in constant time with minimal memory usage

Compares performance against state-of-the-art algorithms

Innovation

Methods, ideas, or system contributions that make the work stand out.

Constant-time consistent hashing algorithm

Minimal memory usage design

Strong theoretical performance guarantees

🔎 Similar Papers

JumpBackHash: Say Goodbye to the Modulo Operation to Distribute Keys Uniformly to Buckets