🤖 AI Summary
Memory layout effects on GPU-accelerated graph-based Approximate Nearest Neighbor Search (ANNS) have been largely overlooked, despite their significant impact on memory access efficiency. Method: This paper systematically uncovers the intrinsic relationship between graph structural properties and GPU memory access patterns. We propose the first unified evaluation framework for GPU graph ANNS, comprising a graph adapter for standardized heterogeneous graph representation, a GPU-optimized traversal engine, and multiple graph reordering strategies—accompanied by quantitative metrics to assess memory layout efficacy. Our reordering methods are orthogonal to existing graph indices and require no modification to underlying algorithms. Contribution/Results: Extensive experiments across multiple datasets and state-of-the-art graph indices (e.g., HNSW, NSG) demonstrate up to 15% higher query throughput with zero accuracy loss, empirically validating that memory layout optimization is critical for accelerating graph ANNS on GPUs.
📝 Abstract
We present the first systematic investigation of graph reordering effects for graph-based Approximate Nearest Neighbor Search (ANNS) on GPUs. While graph-based ANNS has become the dominant paradigm for modern AI applications, recent approaches focus on algorithmic innovations while neglecting memory layout considerations that significantly affect execution time. Our unified evaluation framework enables systematic comparison of diverse reordering strategies across different graph indices through a graph adapter that converts arbitrary graph topologies into a common representation and a GPU-optimized graph traversal engine. We conduct a comprehensive analysis across diverse datasets and state-of-the-art graph indices, introducing metrics that quantify the relationship between structural properties and memory layout effectiveness. Our GPU-targeted reordering achieves up to 15% QPS improvements while preserving search accuracy, demonstrating that memory layout optimization is orthogonal to existing algorithmic innovations. We will release all code upon publication to facilitate reproducibility and foster further research.
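To make the idea of graph reordering concrete, here is a minimal CPU-side sketch of one classic locality-improving strategy: relabelling the nodes of a CSR-format graph in BFS order, so that nodes visited consecutively during traversal receive adjacent IDs and their adjacency lists land near each other in memory. The function names (`bfs_reorder`, `apply_reorder`) are illustrative and hypothetical; this is not the paper's GPU-targeted method, only a sketch of the general technique the abstract refers to.

```python
from collections import deque

def bfs_reorder(indptr, indices):
    """Return perm where perm[old_id] = new_id, assigning IDs in BFS order.

    Illustrative sketch only: BFS relabelling is one simple
    locality-improving reordering, not the paper's method.
    """
    n = len(indptr) - 1
    order = []                      # old IDs in visit order
    seen = [False] * n
    for start in range(n):          # cover disconnected components
        if seen[start]:
            continue
        seen[start] = True
        q = deque([start])
        while q:
            u = q.popleft()
            order.append(u)
            for v in indices[indptr[u]:indptr[u + 1]]:
                if not seen[v]:
                    seen[v] = True
                    q.append(v)
    perm = [0] * n
    for new_id, old_id in enumerate(order):
        perm[old_id] = new_id
    return perm

def apply_reorder(indptr, indices, perm):
    """Rebuild the CSR arrays under the new node labelling."""
    n = len(indptr) - 1
    inv = [0] * n                   # inv[new_id] = old_id
    for old_id in range(n):
        inv[perm[old_id]] = old_id
    new_indptr, new_indices = [0], []
    for new_id in range(n):
        old_id = inv[new_id]
        new_indices.extend(perm[v] for v in
                           indices[indptr[old_id]:indptr[old_id + 1]])
        new_indptr.append(len(new_indices))
    return new_indptr, new_indices
```

Because such a reordering only permutes node IDs and rewrites the index arrays, it leaves the graph topology, and therefore search results, unchanged, which is consistent with the abstract's claim of zero accuracy loss.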