Accelerating Graph Indexing for ANNS on Modern CPUs

📅 2025-02-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the long indexing time of graph-based indexes for high-dimensional approximate nearest neighbor search (ANNS) and their poor fit to modern CPU architectures, this paper proposes Flash, a hardware-aware compact coding strategy. Flash combines vector compression with two CPU features, SIMD parallelism and cache locality, so that distance comparisons during index construction run on compact codes at an acceptable compression error. By reducing random memory accesses and making fuller use of SIMD instructions, Flash achieves a 10.4× to 22.9× speedup in index construction across eight real-world datasets ranging from ten million to one billion vectors, while maintaining or improving search performance.

📝 Abstract
In high-dimensional vector spaces, Approximate Nearest Neighbor Search (ANNS) is a key component in database and artificial intelligence infrastructures. Graph-based methods, particularly HNSW, have emerged as leading solutions among various ANNS approaches, offering an impressive trade-off between search efficiency and accuracy. Many modern vector databases utilize graph indexes as their core algorithms, benefiting from various optimizations to enhance search performance. However, the high indexing time associated with graph algorithms poses a significant challenge, especially given the increasing volume of data, query processing complexity, and dynamic index maintenance demand. This has rendered indexing time a critical performance metric for users. In this paper, we comprehensively analyze the underlying causes of the low graph indexing efficiency on modern CPUs, identifying that distance computation dominates indexing time, primarily due to high memory access latency and suboptimal arithmetic operation efficiency. We demonstrate that distance comparisons during index construction can be effectively performed using compact vector codes at an appropriate compression error. Drawing from insights gained through integrating existing compact coding methods in the graph indexing process, we propose a novel compact coding strategy, named Flash, designed explicitly for graph indexing and optimized for modern CPU architectures. By minimizing random memory accesses and maximizing the utilization of SIMD (Single Instruction, Multiple Data) instructions, Flash significantly enhances cache hit rates and arithmetic operations. Extensive experiments conducted on eight real-world datasets, ranging from ten million to one billion vectors, exhibit that Flash achieves a speedup of 10.4$\times$ to 22.9$\times$ in index construction efficiency, while maintaining or improving search performance.
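
The claim that distance comparisons can run on compact codes can be illustrated with a minimal sketch. It uses plain 8-bit scalar quantization, which is a stand-in chosen for brevity and is not Flash's encoding; the helper names (`fit_range`, `encode`, `dist_sq`, `rank`) and the toy vectors are hypothetical. The point is only that the nearest-to-farthest ordering of candidates, which is what graph construction relies on, can survive compression when the quantization error is small relative to the distance gaps.

```python
def fit_range(vectors):
    """Global min/max over all components (a deliberately simple, assumed scheme)."""
    lo = min(min(v) for v in vectors)
    hi = max(max(v) for v in vectors)
    return lo, hi


def encode(v, lo, hi):
    """Map each float component to one byte in 0..255 (4x smaller than float32)."""
    scale = 255.0 / (hi - lo)
    return bytes(min(255, max(0, round((x - lo) * scale))) for x in v)


def dist_sq(a, b):
    """Squared Euclidean distance; works on float lists and on byte codes alike."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank(query, candidates):
    """Candidate indices sorted from nearest to farthest."""
    return sorted(range(len(candidates)), key=lambda i: dist_sq(query, candidates[i]))


# Toy data (made up for illustration): one far, one close, one mid-range candidate.
query = [0.10, 0.20, 0.30, 0.40]
candidates = [
    [0.90, 0.80, 0.10, 0.00],  # far
    [0.12, 0.18, 0.33, 0.41],  # close
    [0.50, 0.50, 0.50, 0.50],  # mid
]

lo, hi = fit_range([query] + candidates)
exact_order = rank(query, candidates)
code_order = rank(encode(query, lo, hi), [encode(c, lo, hi) for c in candidates])
```

On this toy input both orderings agree, so a neighbor-selection step would make the same choice from codes as from the original floats. When candidates are nearly equidistant the orderings can differ, which is why the abstract qualifies the claim with "an appropriate compression error".
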
Problem

Research questions and friction points this paper is trying to address.

High indexing time in graph-based ANNS methods
Distance computation dominates indexing time
Optimizing graph indexing for modern CPU architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Flash compact coding strategy
Optimized for modern CPUs
Enhances cache and SIMD utilization
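
A back-of-the-envelope sketch can show why compact codes help cache behavior. Every number here is an assumption for illustration and none comes from the paper: 64-byte cache lines, 128-dimensional float32 vectors, 32 neighbors per node, a hypothetical 16-byte code per vector, and cache-line-aligned storage. The function names are likewise hypothetical.

```python
import math

CACHE_LINE_BYTES = 64  # assumed cache-line size


def lines_scattered(num_neighbors, bytes_per_vector):
    """Cache lines touched when each neighbor's vector lives at its own aligned location."""
    return num_neighbors * math.ceil(bytes_per_vector / CACHE_LINE_BYTES)


def lines_contiguous(num_neighbors, bytes_per_code):
    """Cache lines touched when all neighbor codes are packed into one aligned block."""
    return math.ceil(num_neighbors * bytes_per_code / CACHE_LINE_BYTES)


# Assumed setting: 32 neighbors, 128-dim float32 vectors (512 bytes each),
# versus 16-byte codes stored back to back.
full_precision_lines = lines_scattered(32, 128 * 4)
compact_code_lines = lines_contiguous(32, 16)
```

Under these assumptions, visiting one node's neighbors touches 256 cache lines with scattered full-precision vectors and 8 with packed codes. The real ratio depends on the code size and memory layout an index actually uses.
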