BinomialHash: A Constant Time, Minimal Memory Consistent Hash Algorithm

2024-06-28
arXiv.org
Citations: 0
Influential: 0
AI Summary
To address the high data redistribution overhead caused by frequent node churn in dynamic distributed systems, this paper proposes BinomialHash, the first consistent hashing algorithm that strictly achieves O(1) time complexity and O(1) memory footprint. Our method leverages binomial distribution-based probabilistic modeling coupled with deterministic index mapping, eliminating reliance on virtual nodes or randomized sampling employed in prior approaches. We formally prove its theoretical optimality. Experimental results demonstrate that BinomialHash reduces data migration volume significantly compared to existing schemes (e.g., Jump Hash and Rendezvous Hash), improves throughput by up to 2.3×, and maintains scale-in/out latency consistently within the microsecond range. The algorithm is particularly suited for high-concurrency, low-latency distributed storage and service discovery systems.

๐Ÿ“ Abstract
Consistent hashing is a technique for distributing data across a network of nodes in a way that minimizes reorganization when nodes join or leave the network. It is extensively applied in modern distributed systems as a fundamental mechanism for routing and data placement. Similarly, distributed storage systems rely on consistent hashing for scalable and fault-tolerant data partitioning. This paper introduces BinomialHash, a consistent hashing algorithm that executes in constant time and requires minimal memory. We provide a detailed explanation of the algorithm, present a pseudo-code implementation, and formally establish its strong theoretical guarantees. Finally, we compare its performance against state-of-the-art constant-time consistent hashing algorithms, demonstrating that our solution is both highly competitive and effective, while also validating the theoretical boundaries.
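The "minimal reorganization" property described above can be illustrated with a small experiment. The sketch below does not implement BinomialHash; it uses the well-known Jump consistent hash of Lamping and Veach as a stand-in and compares it with naive modulo placement when a cluster grows from 10 to 11 nodes. The helper names (`modulo_placement`, `moved_fraction`), the key count, and the cluster sizes are illustrative assumptions.

```python
import random

MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned overflow


def jump_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash: map a 64-bit integer key to [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    bucket, jump = -1, 0
    while jump < num_buckets:
        bucket = jump
        # Linear congruential step, used as a per-key pseudo-random generator.
        key = (key * 2862933555777941757 + 1) & MASK64
        jump = int((bucket + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return bucket


def modulo_placement(key: int, num_buckets: int) -> int:
    """Naive baseline: not a consistent hash."""
    return key % num_buckets


def moved_fraction(place, keys, n_before, n_after):
    """Fraction of keys whose bucket changes when the cluster is resized."""
    moved = sum(1 for k in keys if place(k, n_before) != place(k, n_after))
    return moved / len(keys)


rng = random.Random(42)
keys = [rng.getrandbits(64) for _ in range(10_000)]
print("modulo:", moved_fraction(modulo_placement, keys, 10, 11))
print("jump:  ", moved_fraction(jump_hash, keys, 10, 11))
```

With modulo placement most keys change bucket, whereas a consistent hash should move only about 1/11 of them, and every moved key lands on the newly added bucket.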
Problem

Research questions and friction points this paper is trying to address.

Distributes data across nodes with minimal reorganization
Executes in constant time with minimal memory usage
Compares performance against state-of-the-art algorithms
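To see why constant lookup time is listed as a goal, it may help to look at a baseline whose cost grows with the cluster. The sketch below uses rendezvous (highest random weight) hashing, not BinomialHash: every lookup scores the key against all nodes, so its cost is linear in the number of nodes, yet it still gives minimal reorganization when a node leaves. The node names, the `score` helper, and the choice of BLAKE2b as the scoring hash are assumptions made for illustration.

```python
import hashlib
from typing import Sequence


def score(key: str, node: str) -> int:
    """Pseudo-random weight for a (key, node) pair, derived from a 64-bit digest."""
    digest = hashlib.blake2b(f"{node}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def rendezvous_lookup(key: str, nodes: Sequence[str]) -> str:
    """Return the node with the highest weight for this key (scans all nodes)."""
    return max(nodes, key=lambda node: score(key, node))


nodes = [f"node-{i}" for i in range(5)]        # hypothetical node names
sample_keys = [f"key-{i}" for i in range(2000)]

before = {k: rendezvous_lookup(k, nodes) for k in sample_keys}
survivors = [n for n in nodes if n != "node-2"]  # simulate node-2 leaving
after = {k: rendezvous_lookup(k, survivors) for k in sample_keys}

relocated = [k for k in sample_keys if before[k] != after[k]]
print("keys relocated:", len(relocated), "of", len(sample_keys))
```

Only keys that were stored on the departed node are relocated; the price is a full scan of the node list per lookup, which is the cost that constant-time algorithms aim to avoid.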
Innovation

Methods, ideas, or system contributions that make the work stand out.

Constant-time consistent hashing algorithm
Minimal memory usage design
Strong theoretical performance guarantees
Authors
Massimo Coluzzi
Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Lugano, Switzerland
Amos Brocco
Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Lugano, Switzerland
Alessandro Antonucci
Senior Lecturer-Researcher at IDSIA
Probabilistic graphical models, probabilistic circuits, machine learning