PathWeaver: A High-Throughput Multi-GPU System for Graph-Based Approximate Nearest Neighbor Search

📅 2025-07-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing GPU-accelerated graph-based approximate nearest neighbor search (ANNS) methods scale poorly across multiple GPUs: they shard the dataset and run an independent search on each GPU, so extra GPUs mainly add memory capacity rather than coordinated compute. This work introduces PathWeaver, a high-throughput multi-GPU framework for graph-based ANNS. Its core contributions are: (1) pipelining-based path extension, a GPU-aware pipelining mechanism that uses GPU-to-GPU communication to cut redundant search iterations; (2) ghost staging, which uses a representative dataset to choose better starting points for each query; and (3) direction-guided selection, which filters out irrelevant points early in the search, reducing unnecessary memory accesses and distance computations. Evaluated across diverse datasets at 95% recall, PathWeaver achieves a 3.24× geometric mean speedup over state-of-the-art multi-GPU ANNS frameworks, with a peak speedup of 5.30×.
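For readers unfamiliar with the "geometric mean speedup" metric, the short Python sketch below shows how such a figure is typically computed from per-dataset speedups. The function name `geomean` and the input values are illustrative assumptions; the numbers are made-up placeholders, not results from the paper.

```python
import math

def geomean(values):
    """Geometric mean of positive numbers, computed in log space."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geomean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Placeholder per-dataset speedups (made up for illustration, NOT from the paper).
example_speedups = [2.0, 8.0]
print(geomean(example_speedups))
```

The geometric mean is the usual choice for averaging ratios such as speedups because a single outlier dataset pulls it less than it would pull an arithmetic mean.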

📝 Abstract

Graph-based Approximate Nearest Neighbor Search (ANNS) is widely adopted in numerous applications, such as recommendation systems, natural language processing, and computer vision. While recent works on GPU-based acceleration have significantly advanced ANNS performance, the ever-growing scale of datasets now demands efficient multi-GPU solutions. However, the design of existing works overlooks multi-GPU scalability, resulting in naive approaches that treat additional GPUs as a means to extend memory capacity for large datasets. This inefficiency arises from partitioning the dataset and independently searching for data points similar to the queries in each GPU. We therefore propose PathWeaver, a novel multi-GPU framework designed to scale and accelerate ANNS for large datasets. First, we propose pipelining-based path extension, a GPU-aware pipelining mechanism that reduces prior work's redundant search iterations by leveraging GPU-to-GPU communication. Second, we design ghost staging that leverages a representative dataset to identify optimal query starting points, reducing the search space for challenging queries. Finally, we introduce direction-guided selection, a data selection technique that filters irrelevant points early in the search process, minimizing unnecessary memory accesses and distance computations. Comprehensive evaluations across diverse datasets demonstrate that PathWeaver achieves 3.24$\times$ geomean speedup and up to 5.30$\times$ speedup on 95% recall rate over state-of-the-art multi-GPU-based ANNS frameworks.
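The naive baseline that the abstract criticizes (partition the dataset, search each partition independently, merge at the end) can be sketched in a few lines of CPU-only Python. This is a minimal illustration under stated assumptions, not code from any of the systems discussed: each "GPU" is a plain list, the per-shard graph search is replaced by an exact scan, and the names `search_shard` and `sharded_search` are hypothetical.

```python
import heapq
import math

def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search_shard(shard, query, k):
    """Stand-in for one GPU's independent search (assumption: exact scan
    instead of a graph traversal). shard is a list of (global_id, vector)."""
    return heapq.nsmallest(k, ((l2(vec, query), gid) for gid, vec in shard))

def sharded_search(shards, query, k):
    """Every shard runs a full top-k search on its own; results are merged last."""
    candidates = []
    for shard in shards:
        candidates.extend(search_shard(shard, query, k))
    return [gid for _, gid in heapq.nsmallest(k, candidates)]

points = [(0, (0.0, 0.0)), (1, (1.0, 0.0)), (2, (5.0, 5.0)), (3, (0.9, 0.1))]
shards = [points[:2], points[2:]]  # two simulated "GPUs"
print(sharded_search(shards, (1.0, 0.0), k=2))  # -> [1, 3]
```

Note that each shard returns `k` candidates regardless of how close they are to the query, so a shard can spend a full search producing results that the merge step then discards. No shard learns anything from the progress of the others, which is the lack of coordination the abstract points to.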

Problem

Research questions and friction points this paper is trying to address.

Scaling graph-based ANNS for large datasets efficiently

Reducing redundant search iterations in multi-GPU systems

Minimizing unnecessary memory accesses in ANNS queries

Innovation

Methods, ideas, or system contributions that make the work stand out.

Pipelining-based path extension reduces redundant searches

Ghost staging optimizes query starting points

Direction-guided selection minimizes unnecessary computations
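Two of the ideas listed above can be illustrated with a toy, CPU-only Python sketch. Everything here is an assumption made for illustration and should not be read as PathWeaver's actual design: `pick_entry` mimics ghost staging by choosing the entry point from a small hand-picked representative subset, and `points_toward` stands in for direction-guided selection with a simple dot-product test that skips neighbors lying in a direction away from the query.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pick_entry(vectors, ghost_ids, query):
    """Ghost-staging-like step (assumption): start from the member of a
    small representative subset that is closest to the query."""
    return min(ghost_ids, key=lambda i: dist(vectors[i], query))

def points_toward(cur, nbr, query):
    """True if the step cur -> nbr has a positive component along cur -> query."""
    return sum((n - c) * (q - c) for c, n, q in zip(cur, nbr, query)) > 0

def greedy_search(vectors, graph, entry, query, use_filter=True):
    """Greedy graph walk; returns (best_id, number_of_distance_computations)."""
    cur, cur_d, evals = entry, dist(vectors[entry], query), 1
    while True:
        best, best_d = cur, cur_d
        for nbr in graph[cur]:
            if use_filter and not points_toward(vectors[cur], vectors[nbr], query):
                continue  # filtered out: no distance computation for this neighbor
            d = dist(vectors[nbr], query)
            evals += 1
            if d < best_d:
                best, best_d = nbr, d
        if best == cur:
            return cur, evals
        cur, cur_d = best, best_d

# Toy data: five points on a line, connected as a chain.
vectors = {i: (float(i), 0.0) for i in range(5)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
query = (3.2, 0.0)

entry = pick_entry(vectors, [0, 4], query)          # ghost subset {0, 4}
print(entry, greedy_search(vectors, graph, entry, query))
print(greedy_search(vectors, graph, entry, query, use_filter=False))
print(greedy_search(vectors, graph, 0, query))      # poor entry point for comparison
```

In this toy, a neighbor whose dot product is non-positive cannot be closer to the query than the current node, so the filter never changes the greedy result; it only avoids distance evaluations. Starting from the staged entry point also shortens the walk compared with starting at node 0. One caveat: this filter still reads the neighbor's vector, so it saves distance computations but not memory accesses. A real system would presumably rely on a cheaper precomputed signal to obtain the memory savings the paper describes.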

🔎 Similar Papers

BANG: Billion-Scale Approximate Nearest Neighbor Search using a Single GPU

2024-01-20 · arXiv.org · Citations: 2
