Parallel Metric Skiplists and Nearest Neighbor Search

📅 2026-06-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

The original construction of metric skip lists is inherently sequential, which has prevented parallelization and limits their efficiency in large-scale nearest neighbor search. This work proposes the first work-efficient parallel construction algorithm for metric skip lists with polylogarithmic span. Relying only on a constant expansion rate, without requiring a bounded aspect ratio, and combining a divide-and-conquer strategy with randomized analysis, the algorithm achieves $O(n \log n)$ expected work and polylogarithmic span with high probability. The method supports nearest neighbor search and several downstream applications, including bichromatic closest pair, density-based clustering, and $k$-nearest neighbor graph construction. For many of these tasks it is, to the authors' knowledge, the first solution that guarantees both work efficiency and polylogarithmic span.

📝 Abstract

The metric skip list is a data structure designed for efficient nearest and $k$-nearest neighbor search in metric spaces. For many real-world datasets with reasonable distributions (specifically, those with a constant expansion rate), it supports $\tilde{O}(n)$ construction time and $O(k\log n)$ query time, where $n$ is the input size and $k$ is the number of nearest neighbors in queries. Notably, unlike alternative approaches, it does not require a bounded aspect ratio, making it more flexible with respect to input data distributions. However, the inherently sequential nature of its original construction has, to our knowledge, precluded any existing parallel algorithm. In this paper, we present highly parallel and work-efficient algorithms for constructing metric skip lists. Under the assumption of a constant expansion rate, our approach achieves an expected work of $O(n \log n)$ and a polylogarithmic span with high probability. Our design is based on novel algorithmic insights that improve the sequential procedure, enabling a divide-and-conquer strategy that facilitates parallelism while maintaining efficiency. With our algorithms, we can also support improved bounds for relevant applications that use nearest neighbor search as a building block, including bichromatic closest pair (BCP), density-based clustering, and $k$-NN graph construction, among others. To our knowledge, many of these results represent the first solutions to achieve both work efficiency and polylogarithmic span, relying solely on the assumption of a constant expansion rate.
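The abstract relies on a "constant expansion rate" without defining it. As a rough illustration only, the sketch below assumes the commonly used definition: the smallest $c$ such that $|B(p, 2r)| \le c \cdot |B(p, r)|$ for every point $p$ and radius $r$, where $B(p, r)$ is the closed ball of radius $r$ around $p$. That definition, the function name `expansion_rate`, and the brute-force approach are assumptions for illustration; they are not taken from the paper and are unrelated to its parallel construction.

```python
import math
from bisect import bisect_right


def expansion_rate(points, dist=math.dist):
    """Brute-force expansion constant of a finite point set (hypothetical helper).

    Assumed definition: max over points p and radii r > 0 of
    |B(p, 2r)| / |B(p, r)|, using closed balls. Runs in O(n^2 log n) time,
    so it is only suitable for small inputs.
    """
    worst = 1.0
    for p in points:
        # Sorted distances from p to every point (includes 0 for p itself).
        ds = sorted(dist(p, q) for q in points)
        # Ball sizes only change when r or 2r crosses an observed distance,
        # so it is enough to test r = d and r = d / 2 for each distance d > 0.
        radii = {d for d in ds if d > 0} | {d / 2 for d in ds if d > 0}
        for r in radii:
            inner = bisect_right(ds, r)      # |B(p, r)|, at least 1
            outer = bisect_right(ds, 2 * r)  # |B(p, 2r)|
            worst = max(worst, outer / inner)
    return worst


if __name__ == "__main__":
    # Four evenly spaced points on a line.
    line = [(0.0,), (1.0,), (2.0,), (3.0,)]
    print(expansion_rate(line))
```

A small value on a dataset suggests that doubling a ball's radius never captures disproportionately many new points, which is the kind of "reasonable distribution" the abstract refers to. Note that the measure does not depend on the ratio between the largest and smallest pairwise distances, which is why it is a separate assumption from a bounded aspect ratio.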

Problem

Research questions and friction points this paper is trying to address.

parallel algorithm

metric skiplist

nearest neighbor search

work efficiency

span

Innovation

Methods, ideas, or system contributions that make the work stand out.

parallel metric skiplist

nearest neighbor search

work-efficient parallel algorithm